Preface



This volume includes the proceedings of the NATO Advanced Study Institute on
“Cooperation: Game Theoretic Approaches”, which took place at Stony Brook, NY,
USA, from July 18 to July 29, 1994.

The Institute was a success and it is already part of a well established biannual 
Stony Brook tradition to which many researchers around the world look forward 
to. 

We thank the Institute for Decision Sciences of the State University of New York
at Stony Brook for hosting this event. It is a particular pleasure to thank Colleen
Wallahora and Eileen Zapia for the very successful organization of this ASI.

Introduction



Sergiu Hartᵃ and Andreu Mas-Colellᵇ



ᵃ The Hebrew University of Jerusalem
ᵇ Harvard University and Universitat Pompeu Fabra, Barcelona



This book constitutes a systematic exposition of the various game theoretic 
approaches to the issue of cooperation. 

Game Theory is the study of decision making in multi-person situations, where 
the outcome depends on everyone's choice. The goal of each participant is to 
maximize his own utility, while taking into account that the other participants are 
doing the same. In such interactive situations, cooperation between the agents
may lead to results that are better, for everyone, than the non-cooperative
outcomes. A simple - but extensively studied - example is the so-called
“Prisoners’ Dilemma”: Assume each one of the two players can ask a generous
donor either to give him 1 million dollars, or to give 4 million dollars to the other
player; the donor will carry out the instructions of both players (thus, for example,
if player 1 asks for $1M to himself and player 2 asks for $4M to the other, then
player 1 gets $5M and player 2 gets nothing). Clearly, whatever the other player
does, it is strictly better for each player to ask for $1M to himself (more precisely,
it will always lead to an additional $1M). This yields $1M for each; cooperation,
whereby each one asks for $4M to the other, would have yielded each $4M
instead! The Prisoners’ Dilemma is by no means an artificial example. The
economic competition between firms exhibits similar phenomena: keeping a 
commodity in short supply may be to the advantage of all producers; at the same 
time, it may be better for any single producer to unilaterally increase his own 
production. 
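
The donor example can be checked mechanically. The following is a minimal sketch, not part of the original text: the names `KEEP`, `GIVE` and `payoffs` are hypothetical labels introduced here for the two requests a player can make, and payoffs are in millions of dollars.

```python
from itertools import product

# Hypothetical labels for the two possible requests to the donor.
KEEP = "ask $1M for self"
GIVE = "ask $4M for other"


def payoffs(action1: str, action2: str) -> tuple:
    """Return (payoff of player 1, payoff of player 2) in $M."""
    # Each player receives $1M if he asked for himself,
    # plus $4M if the other player asked on his behalf.
    p1 = (1 if action1 == KEEP else 0) + (4 if action2 == GIVE else 0)
    p2 = (1 if action2 == KEEP else 0) + (4 if action1 == GIVE else 0)
    return p1, p2


# Full payoff table of the two-player game.
table = {(a1, a2): payoffs(a1, a2) for a1, a2 in product((KEEP, GIVE), repeat=2)}

# Whatever the other player does, asking for oneself yields exactly $1M more.
keep_dominates = all(
    payoffs(KEEP, other)[0] == payoffs(GIVE, other)[0] + 1 for other in (KEEP, GIVE)
)
```

Under these assumptions `keep_dominates` holds, while the profile in which both players ask for the other gives each of them more than the profile in which both ask for themselves.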

The problems that need to be addressed are, first, whether cooperation can be 
reached at all; second, by what procedures are agreements reached; and third, 
which ones will be indeed attained. This volume will survey some of the 
contributions of game theory to these questions, from its early traditional theories 
to its current approaches. 

Game theoretical approaches are usually classified as either “cooperative” or
“non-cooperative”. This should not be viewed as an exclusive division; these are
two ways of looking at the same problem. The Introductory Remarks of R. J.
Aumann that follow this Introduction address this point in some detail.

Part A, which opens this volume, surveys the classical cooperative approach. 
This starts by assuming that binding agreements are possible, and it abstracts 
away from the detailed bargaining procedures. The selection of the appropriate 
cooperative outcome is usually based on a set of desired postulates or axioms, 
which, when applied to a class of problems, characterize one or another solution 
concept. Chapters 1 and 2 by W. Thomson cover the pure bargaining problems, 
where only the grand coalition of all players can reach a beneficial agreement; 
Chapter 1 deals with the classical approaches that originate with Nash’s 1950 
seminal paper, and Chapter 2 deals with recent axiomatizations based on internal 
consistency properties (the “reduced game property”). Chapters 3 and 4, by S.
Hart, survey the general n-person problems where subcoalitions of players can
reach agreements as well, and this of course influences the final outcome. The
classical cooperative solution concepts that arise are grouped into “core-like”
notions and “value-like” notions. The former include the core, the stable sets of
von Neumann and Morgenstern, the bargaining set, the kernel and the nucleolus;
the latter include the Nash bargaining solution, the Shapley value and their many
extensions and generalizations. Chapter 5 by B. Allen deals with games of
incomplete information, i.e., games where some of the participants may possess
private information not known to the others. Here, the questions of cooperation
are further complicated by the need to address the informational issues: how to
ensure that the players have an incentive to reveal the appropriate information.

Part B is devoted to non-cooperative approaches, namely, non-cooperative 
models that lead to cooperative solutions. One may start from a non-cooperative 
bargaining model, like the Stahl-Rubinstein “alternating offers” procedure, 
characterize its strategic equilibria, and relate the resulting outcomes to various 
cooperative solutions. Or, one may start from a cooperative solution, and construct 
games whose equilibria yield precisely this given solution. Either way, one 
establishes connections between non-cooperative and cooperative setups, that 
further strengthen and reinforce one another. In the literature, all this is usually
referred to as “bargaining procedures”, “non-cooperative foundations”, or
“implementation”. The distinctions are not always clear, in particular since some
of the recent implementation literature is concerned with “natural” and “simple”
games. Chapter 6 by A. Mas-Colell covers bargaining procedures that lead to 
value-like cooperative solutions, and the second part of Chapter 7 by P. Reny and 
Chapter 8 by B. Allen, for the case of complete information and incomplete 
information, respectively. Chapter 9 by R. Vohra discusses coalitional non-
cooperative approaches, i.e., models where not only individuals, but also
coalitions may act strategically. Chapter 10 by J. Greenberg surveys the theory of
“social situations”, which looks for stable standards of behavior in general
coalitional interactions.

Part C deals with dynamic models, that is, models of long-term interactions
between the participants. Returning, for example, to the Prisoners’ Dilemma, it
seems clear that if the same participants play it again and again, then cooperation
may indeed be attained. However, this is by no means always so; for instance, in a 
fixed finite-horizon repetition, it is very difficult to escape the non-cooperative 
outcome of $1M each. There is by now a large and deep literature on “repeated 
games”, starting with the so-called “Folk Theorem”, that shows the extent to
which cooperation may arise. The complete information case is covered in
Chapter 11 by S. Sorin.¹ Chapter 12, also by S. Sorin, then goes on to survey
models of communication; namely, one examines the effect of the players being 
able to communicate among themselves before the game is played, and also, in the 
case of a multi-stage game, during the play. This leads to correlation and 
cooperation. Another important issue in multi-stage interactions is that they 
require, by their very nature, extremely complex strategic considerations. This 
suggests considering models where the assumption that players are fully rational
is relaxed, i.e., where they are restricted in one way or another in their choices.
Chapter 13 by
R.J. Aumann discusses some of the underlying ideas and approaches of this kind.
The case where strategies are implemented by automata of bounded complexity is
then studied in Chapter 14 by A. Neyman. Chapter 15 by V. Krishna and T.
Sjöström is devoted to a simple but interesting learning model, known as
“fictitious play”: players assume that the past behavior of their opponents is, in a
certain sense, an appropriate predictor of their future behavior. Chapter 16, also
by V. Krishna and T. Sjöström, studies another type of bounded rationality
models: the “evolutionary models”. These are based on the biological paradigm
of natural selection and evolution, where there is no conscious optimization at all; 
instead, it is the dynamics of the evolution of the population that leads ultimately 
to equilibria and stable outcomes. 

Part D, which concludes this volume, is concerned with “descriptive” results.
One looks at the actual behavior of participants in various interactive situations. 
The question is not “what should rational players do”, but rather “what do they 
do” in specific experiments. Chapter 17 by R. Selten surveys some of the large 
literature on experimental game theory, in particular relating to issues of 
cooperation. Since the outcomes are at times at odds with those predicted by the 
various theories of rational behavior, there is much need to understand what 
exactly are the principles leading to the different behaviors. 



¹ The incomplete information case was also covered in the lectures. The reader is referred
to the Handbook of Game Theory with Economic Applications (edited by R. J. Aumann
and S. Hart, North-Holland, volume I: 1992, volume II: 1994, volume III: forthcoming), for
surveys of this topic (see Chapters 5 and 6 of volume I), as well as of many other related
topics.




Introductory Remarks 



Robert J. Aumann 



The Hebrew University of Jerusalem 



There is a broad division of game theory into two approaches: the cooper- 
ative approach and the noncooperative approach. These approaches should 
not be considered as analyzing different kinds of games; rather, they are
different ways of looking at the same game. As Joachim Rosenmüller has said,
the game is one “ideal” of which the cooperative and the noncooperative 
approaches are two “shadows”. 

The noncooperative theory is strategy oriented. It studies what we ex- 
pect the players to do in the game. The cooperative theory, on the other 
hand, studies the outcomes we expect. In the cooperative approach we look 
directly at the space of outcomes, not the nitty-gritty of how one gets there. 
The noncooperative theory is a kind of micro theory; it involves precise de- 
scriptions of what happens. In the cooperative theory we are interested in 
what the players can achieve; thus we ask how coalitions can form, what
coalitions will form and how the coalitions that do form divide what they 
achieve. 

Why do we call that shadow of the game “cooperative”? “Cooperation” 
seems to indicate more than that. Indeed, though this term is somewhat
misleading, it does have a basis in the theory. In the cooperative theory we 
are interested in feasible outcomes. Thus anything that the players could 
get is taken into consideration, even if it is not incentive compatible for 
them. For example, in the prisoner’s dilemma we are interested also in 
the cooperative outcome. This is done by assuming that the players have 
enforceable contracts available to them; i.e., they can make commitments. 
The players can get into a coalition and agree on a joint course of action, 
and hence on outcomes; and it is assumed that the players must honor their 
commitments. We assume that there is some mechanism, like a court, that 
enforces these contracts, so that all feasible outcomes should be considered. 
This idea of writing a contract is at least reminiscent of cooperative action. 

The distinction between cooperative and noncooperative goes back to 
the dawning of game theory. It appears already in the works of Nash and 
others in the early fifties, and I remember a conference in 1955 (attended 
by von Neumann and Morgenstern) where the issue of cooperative vis-a-
vis noncooperative was discussed. However, it was only in the 60’s that
Harsanyi had the insight of distinguishing commitment as differentiating the
cooperative from the noncooperative model. 

Summing up, the cooperative (or coalitional) approach studies games 
from a macro point of view, focusing on the feasible outcomes that can be 
obtained by enforceable commitments. 




PART A 



CLASSICAL 
COOPERATIVE THEORY 





Cooperative Theory of Bargaining I: Classical 

William Thomson 

Department of Economics, University of Rochester 
Rochester, NY 14627, USA 



In these notes we deal with the so-called “bargaining problem” (Nash, 1950).
Our approach is axiomatic. We search for solutions satisfying some desirable 
properties (axioms). 

1 Domain 

Let $N = \{1, 2, \ldots, n\}$ be the set of agents. A bargaining problem is a
pair $(S, d)$ interpreted as follows: The group of agents $N$ can get any point
in the feasible set $S \subset \mathbb{R}^n$ if they agree on it, and they get $d \in S$, the
disagreement point, if they fail to agree on any point. We assume that

• $S$ is a compact and convex set,

• There is at least one point in $S$ that strictly dominates $d$.

(See Figure 1). 




Figure 1: Bargaining problem

For convenience we also assume that

• $d = 0$,

• $S \subset \mathbb{R}^n_+$,

• $S$ is comprehensive: If $x \in S$ and $x \geq y \geq 0$, then $y \in S$.¹

(See Figure 2a). 

Comprehensiveness of the feasible set is an implication of the assumption of 
free disposal of utility. We sometimes limit our attention to strictly compre- 
hensive problems; that is problems such that the Pareto efficient boundary 
of the feasible set does not contain a segment parallel to an axis. For such a 
problem utility transfers are always possible along the boundary. (See Figure 
2b). As we fixed the disagreement point at d = 0 it suffices to specify a 
feasible set to define a bargaining problem. 




Figure 2: (a) A comprehensive problem (b) a strictly comprehensive problem 

An example of a bargaining problem is the image of an exchange economy
in utility space. (Take the image of the feasible allocations to be the feasible
set $S$ and the image of the endowment $\omega$ to be the disagreement point $d$. See
Figure 3.)

We denote the class of $n$-person problems with $d = 0$ by $\Sigma_0^n$.



¹ Given $x, x' \in \mathbb{R}^n$, $x \geq x'$ means $x_i \geq x'_i$ for all $i$.

Figure 3: Construction of a bargaining problem from an exchange economy 



2 Solutions 

A solution $F$ is a method to choose a feasible point for each problem.
Formally, a solution is a function $F$ from $\Sigma_0^n$ to $\mathbb{R}^n_+$ such that $F(S) \in S$ for all
$S \in \Sigma_0^n$.

Usually on economic domains most solutions are multi-valued, but in 
bargaining theory there are many interesting single-valued solutions and the 
theory has been mainly developed under the assumption of single-valuedness.
We first define a number of the solutions that have been considered, starting 
with the most natural ones. 

The idea of equal gains is central to economic reasoning in a variety of 
contexts. This idea is the motivation for the following solution: 

Egalitarian solution, $E$ (Figure 4a): $E(S)$ is the maximal point of $S$ of
equal coordinates.²

² When the bargaining problem is presented in the classroom to students who have never
been exposed to the theory, the solution that they come up with most often is the egalitarian
solution. The second most often proposed is the Kalai-Smorodinsky solution.

The next solution can be seen as a normalized version of the egalitarian
solution. Define the ideal point of $S$, $a(S)$, as follows: $a_i(S) = \max\{x_i \mid x \in S\}$
for all $i$. The Kalai-Smorodinsky solution sets utility gains from the origin
proportional to the agents’ maximal utilities.

Kalai-Smorodinsky solution, $K$ (Figure 4b): $K(S)$ is the maximal point
of $S$ on the segment connecting the origin to $a(S)$.

The best known solution, introduced by Nash (1950), selects the point at
which the product of utility gains is maximized.

Nash solution, $N$ (Figure 4c): $N(S)$ is the maximizer of the product $\prod_i x_i$
over $S$.
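
The three solutions defined so far can be compared numerically on a small example. The sketch below is an illustration only, under assumptions that are not in the text: there are two agents, and the problem is a polygon written as a list of constraints `(a1, a2, b)` meaning `a1*x1 + a2*x2 <= b` with nonnegative coefficients (together with `x >= 0`), which makes it convex and comprehensive. All function names are hypothetical, and the Nash point is approximated by a ternary search along the frontier rather than computed in closed form.

```python
from typing import List, Tuple

# Assumed encoding: (a1, a2, b) stands for a1*x1 + a2*x2 <= b, with a1, a2 >= 0.
Constraint = Tuple[float, float, float]


def ideal_point(cons: List[Constraint]) -> Tuple[float, float]:
    """a_i(S): the largest utility agent i can get anywhere in S."""
    return tuple(min(c[2] / c[i] for c in cons if c[i] > 0) for i in range(2))


def max_along_ray(cons: List[Constraint], d: Tuple[float, float]) -> Tuple[float, float]:
    """Maximal point of S on the ray from the origin in direction d."""
    t = min(b / (a1 * d[0] + a2 * d[1]) for a1, a2, b in cons if a1 * d[0] + a2 * d[1] > 0)
    return (t * d[0], t * d[1])


def egalitarian(cons: List[Constraint]) -> Tuple[float, float]:
    return max_along_ray(cons, (1.0, 1.0))


def kalai_smorodinsky(cons: List[Constraint]) -> Tuple[float, float]:
    return max_along_ray(cons, ideal_point(cons))


def frontier(cons: List[Constraint], x1: float) -> float:
    """Largest feasible x2 for a given x1."""
    return min((b - a1 * x1) / a2 for a1, a2, b in cons if a2 > 0)


def nash(cons: List[Constraint], iterations: int = 200) -> Tuple[float, float]:
    """Approximate maximizer of x1*x2 by ternary search along the frontier."""
    lo, hi = 0.0, ideal_point(cons)[0]
    for _ in range(iterations):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if m1 * frontier(cons, m1) < m2 * frontier(cons, m2):
            lo = m1
        else:
            hi = m2
    x1 = (lo + hi) / 2
    return (x1, frontier(cons, x1))


# Example problem (an assumption): x1 <= 1.5 and x1 + 2*x2 <= 2.
S = [(1.0, 0.0, 1.5), (1.0, 2.0, 2.0)]
E, K, N = egalitarian(S), kalai_smorodinsky(S), nash(S)
```

On this example the three outcomes differ: the egalitarian point is (2/3, 2/3), the Kalai-Smorodinsky point is (6/7, 4/7), and the Nash point is approximately (1, 1/2).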

The next two solutions are extreme cases of solutions favoring one agent 
at the expense of others. Solutions in the same spirit often appear in social 
choice theory. 

Dictatorial solutions, $D^i$ and $D^{*i}$ (Figure 4d): $D^i(S)$ is the maximal
point $x$ of $S$ with $x_j = 0$ for all $j \neq i$. $D^{*i}(S)$ is the Pareto optimal point
with maximal $i$th coordinate.

The next two solutions are representatives of a family of solutions based 
on processes of balanced concessions: imagine agents working their way from 
their preferred alternatives to a final position by moving from compromise to 
compromise. For the solution defined first, the initial compromise is obtained 
by choosing the halfway point between agents’ most preferred alternatives. 
This point is not in general Pareto efficient. The next step is choosing the 
halfway point between the agents’ most preferred alternatives among those
alternatives that both prefer to the previous compromise. The procedure is 
repeated until a Pareto efficient allocation is reached. 

The discrete Raiffa solution, $R^d$ (Figure 4e): $R^d(S)$ is the limit point
of the sequence $\{z^t\}$ defined by: $x^{i0} = D^i(S)$ for all $i$; for all $t \in \mathbb{N}$,
$z^t = \frac{1}{n} \sum_i x^{i(t-1)}$, and $x^{it}$ is the weakly Pareto optimal point with
$x^{it}_j = z^t_j$ for all $j \neq i$.
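
The balanced-concession process behind the discrete Raiffa solution is easy to iterate by hand or by machine. The following is a minimal two-agent sketch under the same illustrative assumptions as before (a polygonal problem encoded as constraints `(a1, a2, b)` with nonnegative coefficients); `best_for` and `discrete_raiffa` are hypothetical names, and the limit is approximated by a fixed number of steps.

```python
from typing import List, Tuple

# Assumed encoding: (a1, a2, b) stands for a1*x1 + a2*x2 <= b, with a1, a2 >= 0.
Constraint = Tuple[float, float, float]


def best_for(cons: List[Constraint], i: int, other_level: float) -> float:
    """Largest utility of agent i when the other agent receives other_level."""
    j = 1 - i
    return min((c[2] - c[j] * other_level) / c[i] for c in cons if c[i] > 0)


def discrete_raiffa(cons: List[Constraint], steps: int = 60) -> Tuple[float, float]:
    """Iterate z^t = average of the x^{i,t-1}, starting from the dictatorial points."""
    # x[i] plays the role of x^{i,t}; initially x^{i,0} = D^i(S).
    x = [[best_for(cons, 0, 0.0), 0.0], [0.0, best_for(cons, 1, 0.0)]]
    z = [0.0, 0.0]
    for _ in range(steps):
        # Halfway point between the two agents' current preferred alternatives.
        z = [(x[0][k] + x[1][k]) / 2 for k in range(2)]
        # Each agent's best point among those giving the other agent his z-level.
        x = [[best_for(cons, 0, z[1]), z[1]], [z[0], best_for(cons, 1, z[0])]]
    return (z[0], z[1])


# Example problem (an assumption): x1 <= 1.5 and x1 + 2*x2 <= 2.
S = [(1.0, 0.0, 1.5), (1.0, 2.0, 2.0)]
R = discrete_raiffa(S)
```

For this problem the first compromise is (0.75, 0.5), which is not Pareto efficient; the second is (0.875, 0.5625), which lies on the frontier, so the sequence stops moving there.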

The definition of the next solution might not be very transparent but this 
solution has a very appealing characterization based on an additivity condi- 
tion. 

The Perles-Maschler solution, $PM$ (Figure 4f): For $n = 2$. If $\partial S$ (the
weakly Pareto optimal boundary of $S$) is polygonal, $PM(S)$ is the common
limit point of the sequences $\{x^t\}$, $\{y^t\}$ defined by: $x^0 = D^{*1}(S)$, $y^0 =
D^{*2}(S)$; for each $t \in \mathbb{N}$, $x^t, y^t$ are Pareto optimal with $x^t_1 \geq y^t_1$, the segments
$[x^{t-1}, x^t]$, $[y^{t-1}, y^t]$ are Pareto optimal and the products $|(x^{t-1}_1 - x^t_1)(x^{t-1}_2 - x^t_2)|$
and $|(y^{t-1}_1 - y^t_1)(y^{t-1}_2 - y^t_2)|$ are equal and maximal. If $\partial S$ is not polygonal,
$PM(S)$ is defined by approximating $S$ by a sequence of polygonal problems
and taking the limit of associated solution outcomes.

The next solution exemplifies a family of solutions for which compromises 
are evaluated globally: 

The Equal Area solution, $A$ (Figure 4g): For $n = 2$. $A(S)$ is the Pareto
optimal point $x$ such that the area of $S$ to the right of the vertical line through
$x$ is equal to the area of $S$ above the horizontal line through $x$.

The next solution plays an important role in welfare economics: 

Utilitarian solution, $U$ (Figure 4h): $U(S)$ is a maximizer in $S$ of $\sum_i x_i$.
Note that the utilitarian solution is not single-valued in general. To obtain a
well-defined solution one needs to make a selection from the set of maximizers.

Agents cannot simultaneously obtain their preferred outcomes. An intuitively
appealing idea is to try to get as close as possible to satisfying everyone.
The following one-parameter family of solutions reflects the flexibility that
exists in measuring how close two points are to each other:

Yu solutions, $Y^p$ (Figure 4i): Given $p \in (1, \infty)$, $Y^p(S)$ is the point of
$S$ for which the $p$-distance to the ideal point of $S$, $(\sum_i |a_i(S) - x_i|^p)^{1/p}$, is
minimal.





Figure 4a-b: Solutions (a) the Egalitarian solution, (b) the Kalai-Smorodinsky
solution

Figure 4c-i: Solutions (c) the Nash solution, (d) the Dictatorial solutions,
(e) the Discrete Raiffa solution, (f) the Perles-Maschler solution, (g) the
Equal Area solution, (h) the Utilitarian solution, (i) the Yu solutions

3 Axioms and Main Characterizations

In this section we introduce some desirable properties of solutions (the ax- 
ioms) and study their implications. The first axiom we study requires that 
if opportunities expand, then all agents should (weakly) gain. 

Strong monotonicity: For all $S, S' \in \Sigma_0^n$, if $S' \supseteq S$, then $F(S') \geq F(S)$.
(See Figure 5a).





Figure 5: (a) Strong monotonicity: if opportunities expand, all agents 
should (weakly) gain (b) the egalitarian solution is strongly monotonic 

The egalitarian solution is strongly monotonic (See Figure 5b), but nei- 
ther the Kalai-Smorodinsky solution, nor the Nash solution is. One wonders 
if there are other solutions than the egalitarian solution that are strongly 
monotonic. Of course the solution that selects the disagreement point for all 
problems is strongly monotonic. However, it is not Pareto optimal, and not
even weakly Pareto optimal: 

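Pareto optimality: For all $S \in \Sigma_0^n$,

$F(S) \in PO(S) = \{x \in S \mid \nexists x' \in S \text{ with } x' \geq x \text{ and } x' \neq x\}$

Weak Pareto optimality: For all $S \in \Sigma_0^n$,

$F(S) \in WPO(S) = \{x \in S \mid \nexists x' \in S \text{ with } x'_i > x_i \text{ for all } i\}$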
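
The claim that the Kalai-Smorodinsky solution is not strongly monotonic, while the egalitarian solution is, can be illustrated on a pair of nested problems. This is a minimal sketch under illustrative assumptions: two agents, problems encoded as constraints `(a1, a2, b)` meaning `a1*x1 + a2*x2 <= b` with nonnegative coefficients, and hypothetical function names. Dropping a constraint enlarges the feasible set.

```python
from typing import List, Tuple

# Assumed encoding: (a1, a2, b) stands for a1*x1 + a2*x2 <= b, with a1, a2 >= 0.
Constraint = Tuple[float, float, float]


def ideal_point(cons: List[Constraint]) -> Tuple[float, float]:
    """a_i(S): the largest utility agent i can get anywhere in S."""
    return tuple(min(c[2] / c[i] for c in cons if c[i] > 0) for i in range(2))


def max_along_ray(cons: List[Constraint], d: Tuple[float, float]) -> Tuple[float, float]:
    """Maximal point of S on the ray from the origin in direction d."""
    t = min(b / (a1 * d[0] + a2 * d[1]) for a1, a2, b in cons if a1 * d[0] + a2 * d[1] > 0)
    return (t * d[0], t * d[1])


def egalitarian(cons: List[Constraint]) -> Tuple[float, float]:
    return max_along_ray(cons, (1.0, 1.0))


def kalai_smorodinsky(cons: List[Constraint]) -> Tuple[float, float]:
    return max_along_ray(cons, ideal_point(cons))


# S_small is contained in S_large: the larger problem drops the cap x1 <= 1.5.
S_small = [(1.0, 0.0, 1.5), (1.0, 2.0, 2.0)]
S_large = [(1.0, 2.0, 2.0)]

K_small, K_large = kalai_smorodinsky(S_small), kalai_smorodinsky(S_large)
E_small, E_large = egalitarian(S_small), egalitarian(S_large)

# Agent 2 loses under K when opportunities expand; nobody loses under E.
agent2_loses_under_K = K_large[1] < K_small[1]
nobody_loses_under_E = all(E_large[i] >= E_small[i] for i in range(2))
```

Here the expansion raises agent 1's maximal utility from 1.5 to 2, which tilts the Kalai-Smorodinsky ray toward agent 1 and lowers agent 2's payoff from 4/7 to 1/2, whereas the egalitarian outcome stays at (2/3, 2/3).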

Note that the egalitarian solution is only weakly Pareto optimal. It is
Pareto optimal on the class of strictly comprehensive problems.

What about the solution that selects the maximal point on the 30° line
for all problems? (See Figure 6a). This solution is strongly monotonic as
well as weakly Pareto optimal. More generally, consider the following class of
solutions: Fix a monotone path in $\mathbb{R}^n_+$ emanating from the origin, and for
each problem choose the maximal feasible point on this path. We call any
such solution a monotone path solution. (See Figure 6b).





Figure 6: (a) The solution that selects the maximal point on the 30° line is
strongly monotonic (b) monotone path solutions are strongly monotonic

Monotone path solutions are strongly monotonic and weakly Pareto opti- 
mal. But none of them (except the egalitarian solution) is symmetric: 

Symmetry: If $S$ is invariant under all exchanges of agents, $F_i(S) = F_j(S)$
for all $i, j$.

Next we have the first theorem: 

Theorem 1 (Kalai, 1977): The egalitarian solution is the only solution sat- 
isfying weak Pareto optimality, symmetry, and strong monotonicity. 

Proof. It is easy to show that the egalitarian solution satisfies weak Pareto 
optimality, symmetry, and strong monotonicity. Conversely let F satisfy the 
three axioms. We first consider the class of strictly comprehensive problems. 
Let $S \in \Sigma_0^n$ be strictly comprehensive. We need to show that $F(S) = E(S)$.
Let $E(S) = x$ and $S' = cch\{x\}$.³ Then $F(S') = x$ by weak Pareto optimality
and symmetry. Furthermore $S \supseteq S'$. Therefore $F(S) \geq F(S') = x$ by
strong monotonicity. Yet $x$ is the only feasible point to satisfy this inequality.
Therefore $F(S) = x = E(S)$. (See Figure 7).

³ Given $A \subset \mathbb{R}^n_+$, $cch\{A\}$ denotes the “convex and comprehensive hull” of $A$: it is the
smallest convex and comprehensive subset of $\mathbb{R}^n_+$ containing $A$. If $x, y \in \mathbb{R}^n_+$ we write
$cch\{x, y\}$ instead of $cch\{\{x, y\}\}$.




Figure 7: Characterization of the egalitarian solution on the basis of strong
monotonicity: the case of strictly comprehensive problems

Next we consider the class of comprehensive problems. Let $S \in \Sigma_0^n$. If
$E(S)$ is Pareto efficient then we can conclude as before. Suppose then that
$E(S)$ is not Pareto efficient. (Figure 8). We need to show that $F(S) =
E(S) = x$. By the previous argument applied to $S$ and $S'$, we have $x \leq
F(S)$. Suppose $F(S) = z \neq x$. Then we construct a strictly comprehensive
problem $T \supseteq S$ such that $E(T)$ is to the northwest of $z$. We have
$F(T) = E(T)$ as $T$ is strictly comprehensive. Hence $F(T)$
is to the northwest of $F(S) = z$, contradicting strong monotonicity.
Therefore $F(S) = E(S) = x$. Q.E.D.

Note that the axioms of Theorem 1 are independent. If we drop weak 
Pareto optimality, the solution which selects the disagreement point for all 
problems satisfies symmetry and strong monotonicity. If we drop symme- 
try, any monotone path solution satisfies weak Pareto optimality and strong 
monotonicity. If we drop strong monotonicity, the Nash solution (and most 
of the solutions mentioned above) satisfies weak Pareto optimality and sym- 
metry. 

What if we drop the comprehensiveness assumption? Then the egalitar- 
ian solution is not necessarily weakly Pareto optimal. In fact it may suggest 
the worst possible feasible point.

Figure 8: Characterization of the egalitarian solution on the basis of strong
monotonicity: the case of comprehensive but not strictly comprehensive problems

Can we recover weak Pareto optimality in this domain? Not as long as the 
solution is strongly monotonic. On this domain there is no solution satisfying 
weak Pareto optimality and strong monotonicity. To see this, consider the 
example of Figure 9. Suppose F satisfies weak Pareto optimality and strong 
monotonicity. Then $F(S) = x$ by weak Pareto optimality and then $F(S'') =
F(S) = x$ by strong monotonicity. By the same reasoning $F(S'') = F(S') =
y \neq x$, a contradiction.




Figure 9: An impossibility: there is no solution satisfying weak Pareto optimality
and strong monotonicity if non-comprehensive problems are permitted

One of the weaknesses of the egalitarian solution is that it is not fully
Pareto optimal. However there is a very natural way to adjust its definition
so as to recover Pareto optimality. Given $x \in \mathbb{R}^n$, let $\tilde{x} \in \mathbb{R}^n$ denote the
vector obtained from $x$ by writing its coordinates in increasing order. Given
$x, y \in \mathbb{R}^n$, $x$ is lexicographically greater than $y$ if $\tilde{x}_1 > \tilde{y}_1$, or
[$\tilde{x}_1 = \tilde{y}_1$ and $\tilde{x}_2 > \tilde{y}_2$], or, more generally, for some $k \in \{1, 2, \ldots, n-1\}$,
[$\tilde{x}_1 = \tilde{y}_1, \ldots, \tilde{x}_k = \tilde{y}_k$ and $\tilde{x}_{k+1} > \tilde{y}_{k+1}$].
Now, given $S \in \Sigma_0^n$, the lexicographic egalitarian
solution outcome of $S$, $E^L(S)$, is the point of $S$ that is lexicographically
maximal. (See Figure 10).
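
The lexicographic comparison translates directly into code, since Python compares lists coordinate by coordinate. The sketch below is illustrative only: the function names are hypothetical, and the maximization is carried out over a finite list of candidate points rather than over all of $S$, which is a simplifying assumption.

```python
from typing import List, Sequence, Tuple


def increasing_order(x: Sequence[float]) -> List[float]:
    """The vector obtained from x by writing its coordinates in increasing order."""
    return sorted(x)


def lex_greater(x: Sequence[float], y: Sequence[float]) -> bool:
    """True if x is lexicographically greater than y in the sense of the text."""
    # Python list comparison is lexicographic: it looks at the first
    # coordinate where the two reordered vectors differ.
    return increasing_order(x) > increasing_order(y)


def lex_maximal(candidates: Sequence[Tuple[float, ...]]) -> Tuple[float, ...]:
    """Lexicographically maximal point among finitely many candidates."""
    return max(candidates, key=increasing_order)


# Candidate points of the rectangle cch{(1, 2)} (an assumed example problem):
# the egalitarian point (1, 1) is only weakly Pareto optimal there.
candidates = [(0.0, 0.0), (1.0, 1.0), (0.5, 2.0), (1.0, 2.0)]
outcome = lex_maximal(candidates)
```

Among these candidates the maximal point is (1, 2): it ties with the egalitarian point (1, 1) on the smallest coordinate and beats it on the second smallest, which is how the lexicographic refinement recovers Pareto optimality.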








Figure 10: The lexicographic egalitarian solution (a) a two agent case (b) 
a three agent case 

Note that the lexicographic egalitarian solution is Pareto optimal even if 
the domain is not comprehensive. 

The next property requires a solution to be immune to positive affine 
transformations of the utility functions. (Recall that von Neumann-Morgenstern 
utilities are unique up to positive affine transformations). 

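Let $\Lambda_0^n : \mathbb{R}^n \to \mathbb{R}^n$ be the class of independent person by person,
positive linear transformations: $\lambda \in \Lambda_0^n$ if there is $k \in \mathbb{R}^n_{++}$ such that
for all $x \in \mathbb{R}^n$, $\lambda(x) = (k_1 x_1, \ldots, k_n x_n)$. Given $\lambda \in \Lambda_0^n$ and $S \subset \mathbb{R}^n$,
$\lambda(S) = \{x' \in \mathbb{R}^n \mid \exists x \in S \text{ with } x' = \lambda(x)\}$.

Scale invariance: For all $S \in \Sigma_0^n$ and all $\lambda \in \Lambda_0^n$, $\lambda(F(S)) = F(\lambda(S))$.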

Both the Kalai-Smorodinsky solution and the Nash solution are scale in- 
variant. However the egalitarian solution is not. 
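
A quick numerical check makes the contrast concrete. The sketch below is an illustration under assumptions that are not in the text: two agents, a polygonal problem encoded as constraints `(a1, a2, b)` meaning `a1*x1 + a2*x2 <= b` with nonnegative coefficients, and hypothetical function names. Rescaling agent $i$'s utility by $k_i$ divides the $i$th coefficient of every constraint by $k_i$.

```python
from typing import List, Tuple

# Assumed encoding: (a1, a2, b) stands for a1*x1 + a2*x2 <= b, with a1, a2 >= 0.
Constraint = Tuple[float, float, float]


def ideal_point(cons: List[Constraint]) -> Tuple[float, float]:
    """a_i(S): the largest utility agent i can get anywhere in S."""
    return tuple(min(c[2] / c[i] for c in cons if c[i] > 0) for i in range(2))


def max_along_ray(cons: List[Constraint], d: Tuple[float, float]) -> Tuple[float, float]:
    """Maximal point of S on the ray from the origin in direction d."""
    t = min(b / (a1 * d[0] + a2 * d[1]) for a1, a2, b in cons if a1 * d[0] + a2 * d[1] > 0)
    return (t * d[0], t * d[1])


def egalitarian(cons: List[Constraint]) -> Tuple[float, float]:
    return max_along_ray(cons, (1.0, 1.0))


def kalai_smorodinsky(cons: List[Constraint]) -> Tuple[float, float]:
    return max_along_ray(cons, ideal_point(cons))


def scale_problem(cons: List[Constraint], k: Tuple[float, float]) -> List[Constraint]:
    """Image of S under x -> (k1*x1, k2*x2)."""
    return [(a1 / k[0], a2 / k[1], b) for a1, a2, b in cons]


def scale_point(x: Tuple[float, float], k: Tuple[float, float]) -> Tuple[float, float]:
    return (k[0] * x[0], k[1] * x[1])


def close(x: Tuple[float, float], y: Tuple[float, float], tol: float = 1e-9) -> bool:
    return all(abs(xi - yi) < tol for xi, yi in zip(x, y))


S = [(1.0, 0.0, 1.5), (1.0, 2.0, 2.0)]  # example problem (an assumption)
k = (2.0, 1.0)                          # double agent 1's utility scale

K_commutes = close(kalai_smorodinsky(scale_problem(S, k)), scale_point(kalai_smorodinsky(S), k))
E_commutes = close(egalitarian(scale_problem(S, k)), scale_point(egalitarian(S), k))
```

In this example `K_commutes` holds and `E_commutes` fails: the egalitarian outcome of the rescaled problem is (0.8, 0.8), not the rescaled point (4/3, 2/3).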

One of the problems with strong monotonicity is that it requires every- 
body to gain even if opportunities expand in a way that really “favors” one 
of the agents. This is the motivation for the following property. It requires 
that if opportunities expand in a direction favorable to an agent, in the sense 
that the maximal utilities of other agents remain the same, he (weakly) gains. 

Individual monotonicity: For all $S, S' \in \Sigma_0^n$, if $S' \supseteq S$ and $a_j(S') = a_j(S)$
for all $j \neq i$, then $F_i(S') \geq F_i(S)$. (See Figure 11).




Figure 11: Individual monotonicity: if the opportunities expand in a direc- 
tion favorable to agent 1 then he (weakly) gains 

The Kalai-Smorodinsky solution is individually monotonic. (See Figure 
12a). The advantage of weakening strong monotonicity is that we recover scale
invariance. However we recover it essentially in a unique way as revealed by 
the following theorem. 

Theorem 2 (Kalai and Smorodinsky, 1975): The Kalai-Smorodinsky solu- 
tion is the only solution satisfying weak Pareto optimality, symmetry, scale 
invariance, and individual monotonicity. 

Proof: It is easy to show that the Kalai-Smorodinsky solution satisfies weak
Pareto optimality, symmetry, scale invariance, and individual monotonicity.
Conversely let $F$ satisfy the four axioms. Let $S \in \Sigma_0^2$ be such that $K(S)$ is on
the 45° line. This is without loss of generality as otherwise we can transform
$S$ by a positive affine transformation and use scale invariance. We need to
show that $F(S) = K(S)$. Let $S' = cch\{(a_1(S), 0), K(S), (0, a_2(S))\}$. Note
that $a_1(S) = a_2(S)$ as $K(S)$ is on the 45° line. Hence $S'$ is a symmetric
problem. Therefore $F(S') = K(S)$ by weak Pareto optimality and symmetry.
But $S \supseteq S'$, $a_1(S) = a_1(S')$, and $a_2(S) = a_2(S')$. Therefore using individual
monotonicity twice we have $F_1(S) \geq F_1(S') = K_1(S)$ and $F_2(S) \geq F_2(S') =
K_2(S)$. But $K(S)$ is the only feasible point to satisfy these two inequalities
and hence $F(S) = K(S)$. (See Figure 12b). Q.E.D.




Figure 12: (a) The Kalai-Smorodinsky solution is individually monotonic 
(b) a characterization of the Kalai-Smorodinsky solution on the basis of in- 
dividual monotonicity 

Note that the Kalai-Smorodinsky solution also satisfies Pareto optimality. 

An alternative monotonicity condition, restricted monotonicity, says that
if opportunities expand but the maximal utilities remain the same, then all 
agents should (weakly) gain. 

Restricted monotonicity: For all $S, S' \in \Sigma_0^n$, if $S' \supseteq S$ and $a(S') = a(S)$,
then $F(S') \geq F(S)$. (See Figure 13).

We may replace individual monotonicity with restricted monotonicity in 
Theorem 2 and obtain an alternative characterization of the Kalai-Smorodinsky 
solution.

Figure 13: Restricted monotonicity: if opportunities expand without favoring
anybody then everybody gains

One difficulty in extending the Kalai-Smorodinsky solution to classes of
not necessarily comprehensive problems with more than two agents is
the following: The Kalai-Smorodinsky outcome may fail to satisfy Pareto
optimality; in fact it may even be dominated by all other feasible points. (See
Figure 14). However once comprehensiveness is imposed, the Kalai-Smorodinsky
solution satisfies weak Pareto optimality.




Figure 14: A difficulty in extending the Kalai-Smorodinsky solution for not
necessarily comprehensive and more than two person problems: the Kalai-
Smorodinsky outcome is dominated by all other points

Next consider the following independence property. It is in the same spirit
as Arrow’s independence of irrelevant alternatives condition.

Contraction Independence: For all $S, S' \in \Sigma_0^n$, if $S' \subseteq S$ and $F(S) \in S'$,
then $F(S') = F(S)$. (See Figure 15).




Figure 15: Contraction independence: if opportunities shrink leaving the 
solution outcome feasible, it should still be selected 

The Nash solution satisfies contraction independence. (See Figure 16a). It is 
also Pareto optimal (even in the case of non-comprehensive problems), sym- 
metric, and scale invariant. In fact, we have: 

Theorem 3 (Nash, 1950): The Nash solution is the only solution satisfying 
Pareto optimality, symmetry, scale invariance, and contraction independence. 

Proof: It is easy to show that the Nash solution satisfies Pareto optimality, 
symmetry, scale invariance, and contraction independence. 

Conversely let $F$ satisfy the four axioms. Let $S \in \Sigma_0^2$ be such that $N(S)$
is on the 45° line. This is without loss of generality as otherwise we can
transform $S$ by a positive affine transformation and use scale invariance. We
need to show that $F(S) = N(S)$. Let $T = \{y \in \mathbb{R}^2_+ \mid \sum_i y_i \leq \sum_i N_i(S)\}$.
The problem $T$ is symmetric and $N(S) \in PO(T)$. Thus $F(T) = N(S)$. But
$T \supseteq S$ and $F(T) = N(S) \in S$, therefore $F(S) = F(T) = N(S)$ by contraction
independence. (See Figure 16b). Q.E.D.

Figure 16: (a) The Nash solution satisfies contraction independence (b) a
characterization of the Nash solution based on contraction independence
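
The proof uses the inclusion $T \supseteq S$ without spelling it out. One way to see it, sketched here for two agents under the normalization $N(S) = (c, c)$ with $c > 0$ (the symbol $c$ and the auxiliary points $x^\varepsilon$ are notation introduced only for this sketch), is to argue by contradiction using the convexity of $S$:

```latex
% Sketch: if some y in S had y_1 + y_2 > 2c, points of S close to N(S) = (c, c)
% on the segment towards y would have a larger product than N(S).
\begin{align*}
x^\varepsilon &= (1-\varepsilon)\,(c, c) + \varepsilon\, y \in S
  \qquad (0 < \varepsilon \le 1, \text{ by convexity of } S), \\
x^\varepsilon_1 x^\varepsilon_2
  &= \bigl(c + \varepsilon (y_1 - c)\bigr)\bigl(c + \varepsilon (y_2 - c)\bigr) \\
  &= c^2 + \varepsilon\, c\,(y_1 + y_2 - 2c) + \varepsilon^2 (y_1 - c)(y_2 - c)
  \;>\; c^2 \quad \text{for all small enough } \varepsilon > 0 .
\end{align*}
```

This would contradict the fact that $N(S)$ maximizes the product over $S$, so every $y \in S$ satisfies $y_1 + y_2 \leq 2c = \sum_i N_i(S)$, that is, $S \subseteq T$.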



The characterization of the Nash solution easily extends to more than two
person problems.

1 Variable Number of Agents



The model discussed in Part I is written under the assumption of a fixed
number of agents. Here, the number of agents is allowed to vary. Requirements
on how solutions should respond to such changes are formulated as axioms, and
additional characterizations of the main solutions are developed. A detailed account of
these recent developments can be found in Thomson and Lensberg (1989).

The model is expanded to allow problems involving any finite number of 
agents. Agents are indexed by the integers. Let V be the set of all finite 
subsets of the integers. Given P £ V, the set of problems that the group 
P could face is denoted by Df . The set of all problems is denoted by So = 
Up^v^o ' A solution is a function F : So~> such that for all 

P £ V and all 5 G Sf , F(S) £ S. Figure 1 represents a 2-agent problem 
5 G Sq involving agent 2 and agent 3. 




Figure 1: A 2-agent problem involving agent 2 and agent 3



In this more general model it is possible to define solutions by “combining”
different solutions across cardinalities: say, the egalitarian solution could be
used for 2-agent bargaining problems; the Kalai-Smorodinsky solution for 3-agent
problems; the egalitarian solution again for 4-agent problems; and so on.
One may find it more reasonable to use the same solution for all group sizes.
But even if we decide to do that, how do we compare the solution which uses
the Kalai-Smorodinsky solution for all cardinalities and the solution which
uses the egalitarian solution for all cardinalities? We need to develop a
theory to help us link the components of solutions across cardinalities. In
what follows we propose two principles that can be used for that purpose.



2 Population Monotonicity and the
Egalitarian Solution



Consider a group of agents $P \in \mathcal{P}$ that have to divide a bundle of goods on
which they have equal rights. They do this by applying the component of
a solution pertaining to the group $P$. Then new agents come in, who are
recognized to have equally valid rights on the goods. This requires that the
goods be redivided by applying the component of this solution pertaining to
the enlarged group. Population monotonicity says that none of the agents
initially present should gain. Conversely, and equivalently, if some agents
relinquish their rights, all remaining agents should be at least as well off.
In bargaining theory, population monotonicity says that the projection of the
solution outcome of the problem involving the larger group onto the subspace
relative to the smaller group is (weakly) Pareto-dominated by the solution
outcome of the intersection of the larger problem with that subspace. (Figure 2).




Figure 2: Population monotonicity in bargaining theory 

Population monotonicity: For all $P, Q \in \mathcal{P}$ with $P \subseteq Q$, if $S \in \Sigma_0^P$ and
$T \in \Sigma_0^Q$ are such that $S = T_P$, then $F(S) \geq F_P(T)$.¹



The egalitarian solution (Figure 3) and the Kalai-Smorodinsky solution 
satisfy this requirement; but the Nash solution does not. Characterizations of 
the egalitarian solution and the Kalai-Smorodinsky solution can be obtained 
with the help of population monotonicity. 
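
Population monotonicity of the egalitarian solution can be checked by hand on a simple class of problems. The Python sketch below is only an illustration under stated assumptions: it restricts attention to "hyperplane" problems $T = \{x \geq 0 \mid \sum_i x_i / c_i \leq 1\}$, for which the egalitarian outcome gives every agent the common payoff $1 / \sum_i (1/c_i)$; the intercepts `c` and the function names are hypothetical.

```python
from fractions import Fraction

def egalitarian(c):
    """E(T) for T = {x >= 0 : sum_i x_i / c_i <= 1}: equal payoffs on the boundary."""
    t = 1 / sum(Fraction(1) / ci for ci in c.values())
    return {i: t for i in c}

def restrict(c, P):
    """T_P: intersection of T with the coordinate subspace of the group P."""
    return {i: c[i] for i in P}

# Hypothetical intercepts for the larger group Q = {1, 2, 3}.
c = {1: Fraction(3), 2: Fraction(6), 3: Fraction(2)}
E_T = egalitarian(c)                     # every agent gets 1
E_S = egalitarian(restrict(c, {1, 2}))   # agents 1 and 2 get 2 each
```

In this example each member of the smaller group $P = \{1, 2\}$ gets 2 in $S = T_P$ and only 1 in $T$, in line with the axiom.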




Figure 3: The egalitarian solution is population monotonic: the projection
of the egalitarian outcome of the problem T onto the coordinate subspace 
pertaining to agents 1 and 2 is dominated by the egalitarian outcome of 
the problem S which is the intersection of T with the coordinate subspace 
pertaining to agents 1 and 2 

Theorem 1 (Thomson 1983a): The egalitarian solution is the only solution
on $\Sigma_0$ satisfying weak Pareto optimality, symmetry, contraction independence,
continuity, and population monotonicity.

Proof (Sketch): It is easy to verify that $E$ satisfies the five axioms (see Figure 3
for population monotonicity). Conversely, let $F$ be a solution on $\Sigma_0$ satisfying
the five axioms. We only show that $F$ coincides with $E$ on $\Sigma_0^{\{1,2\}}$ (Figure 4).

For simplicity, let $S \in \Sigma_0^{\{1,2\}}$ be given. Suppose that $E(S) = (a, a)$. Let
$T \in \Sigma_0^{\{1,2,3\}}$ be such that $T = \{x \in \mathbb{R}_+^{\{1,2,3\}} \mid x_1 + x_2 + x_3 \leq 3a\}$. By weak
Pareto optimality and symmetry, $F(T) = (a, a, a)$. Now, let $T' \in \Sigma_0^{\{1,2,3\}}$
be such that $T' = cch\{S, F(T)\}$. Since $T' \subseteq T$ and $F(T) \in T'$, it follows
from contraction independence that $F(T') = F(T)$. Suppose $S \subseteq T_{\{1,2\}}$.
Then $T'_{\{1,2\}} = S$, so that by population monotonicity, $F(S) \geq E(S)$. If
$E(S) \in PO(S)$, then we are done. Otherwise we conclude by continuity. Next
suppose $S \not\subseteq T_{\{1,2\}}$. In this case the previous arguments can be modified by
adding more than one agent and obtaining the inclusion. Q.E.D.

¹ Note that $T_P$ is the intersection of the problem $T$ with the subspace $\mathbb{R}^P$, and $F_P(T)$
is the projection of the outcome $F(T)$ onto the subspace $\mathbb{R}^P$.




Figure 4: Characterization of the egalitarian solution on the basis of population
monotonicity

3 Population Monotonicity and the
Kalai-Smorodinsky Solution



All of the axioms used in the next theorem except anonymity have already 
been discussed. This property is a strengthening of symmetry and it requires 
the solution to be invariant under exchanges of the names of the agents in 
each group, as well as the replacement of some of its members by other agents. 
(See Figure 5). 

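Anonymity: Given $P, P' \in \mathcal{P}$ with $|P| = |P'|$, $S \in \Sigma_0^P$ and $S' \in \Sigma_0^{P'}$,
if there exists a bijection $\gamma : P \to P'$ such that
$S' = \{x' \in \mathbb{R}_+^{P'} \mid \exists x \in S \text{ with } x_i = x'_{\gamma(i)} \text{ for all } i \in P\}$,
then $F_{\gamma(i)}(S') = F_i(S)$ for all $i \in P$.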





Figure 5: Anonymity with a variable number of agents 



The following theorem differs from the previous one in that anonymity is used
instead of symmetry and scale invariance instead of contraction independence.

Theorem 2 (Thomson 1983b): The Kalai-Smorodinsky solution is the only
solution on $\Sigma_0$ satisfying weak Pareto optimality, anonymity, scale invariance,
continuity, and population monotonicity.

Proof (Sketch): It is straightforward to see that $K$ satisfies the five axioms.
Conversely, let $F$ be a solution on $\Sigma_0$ satisfying the five axioms. We only
show that $F$ coincides with $K$ on $\Sigma_0^{\{1,2\}}$ (Figure 6). Let $S \in \Sigma_0^{\{1,2\}}$ be given. By
scale invariance, we can assume that $S$ is normalized so that $a(S)$ has equal
coordinates. Suppose then that $K(S) = (\alpha, \alpha)$ and let $x = (\alpha, \alpha, \alpha)$. We
construct $T \in \Sigma_0^{\{1,2,3\}}$ by replicating $S$ in the coordinate subspaces
$\mathbb{R}^{\{2,3\}}$ and $\mathbb{R}^{\{1,3\}}$, and taking the comprehensive hull of $S$, its two replicas, and
the point $x$. Since all agents enter symmetrically in the definition of $T$
and $x \in PO(T)$, it follows from anonymity and weak Pareto optimality that
$F(T) = x$. Now, note that $T_{\{1,2\}} = S$, so that by population monotonicity,
$F(S) \geq x_{\{1,2\}} = K(S)$, and hence $F(S) = K(S)$, since $K(S)$ is Pareto optimal in
two-agent problems. To prove that $F$ and $K$ coincide for problems of cardinality
greater than 2, one has to introduce more agents and continuity becomes
necessary. Q.E.D.







Figure 6: Characterization of the Kalai-Smorodinsky solution on the basis 
of population monotonicity 



4 Consistency and the Nash Solution 



Instead of allowing the agents to depart empty-handed, we will now imagine
them to leave with their payoffs. A solution is consistent if the remaining
agents receive the same payoffs when some of the agents depart with their
payoffs. To be precise, let $Q \in \mathcal{P}$ and $T \in \Sigma_0^Q$, and consider some point
$x \in T$ as the candidate compromise for $T$. Assume that it has been accepted
by the subgroup $P'$, and let us imagine its members leaving the scene with
the understanding that they will indeed receive their payoffs $x_{P'}$. Now, let
us reevaluate the situation from the viewpoint of the group $P = Q \setminus P'$ of
remaining agents. It is natural to think of the set $\{y \in \mathbb{R}_+^P \mid (y, x_{P'}) \in T\}$,
obtained from points of $T$ by giving the agents in $P'$ the payoffs $x_{P'}$, as the
feasible set for $P$. Let us denote it by $t_P^x(T)$. Geometrically, $t_P^x(T)$ is the
“slice” of $T$ through $x$ by a plane parallel to the coordinate subspace relative
to group $P$. If this set is a well-defined member of $\Sigma_0^P$, consistency requires
that, when $x = F(T)$, the solution recommend the utilities $x_P$ for it. (Figure 7).

Consistency: Given $P, Q \in \mathcal{P}$ with $P \subseteq Q$, if $S \in \Sigma_0^P$ and $T \in \Sigma_0^Q$ are such
that $x = F(T)$ and $t_P^x(T) = S$, then $F(S) = x_P$.




(a) (b) 



Figure 7: (a) Consistency in exchange economies: if agent 3 leaves with his
consumption $z_3$, the remaining agents should be unaffected; (b) consistency
in bargaining theory: if agent 3 leaves with his payoff $F_3(T)$, the remaining
agents should be unaffected

Consistency is satisfied by the Nash solution (Harsanyi, 1959) but not by
the Kalai-Smorodinsky solution, nor by the egalitarian solution. Violations
are common for the Kalai-Smorodinsky solution but rare for the egalitarian
solution; indeed, on the class of strictly comprehensive problems, the egalitarian
solution does satisfy the condition, and if this restriction is not imposed, it
still satisfies the slightly weaker condition obtained by requiring $F(S) \geq x_P$
instead of $F(S) = x_P$. Let us call this weaker condition weak consistency.
The lexicographic egalitarian solution satisfies consistency.
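
Consistency of the Nash solution can be verified directly on a simple class of problems. The following Python sketch is an illustration under stated assumptions, not a general implementation: it considers only problems of the form $T = \{x \geq 0 \mid \sum_i x_i / c_i \leq r\}$, for which the Nash solution is $x_i = r c_i / n$, and the slice through $x$ is again a problem of the same form with a smaller right-hand side. The intercepts `c` and the function names are hypothetical.

```python
from fractions import Fraction

def nash(c, r=Fraction(1)):
    """N(T) for T = {x >= 0 : sum_i x_i / c_i <= r}: x_i = r * c_i / n."""
    n = len(c)
    return {i: r * ci / n for i, ci in c.items()}

def reduced_problem(c, x, P, r=Fraction(1)):
    """Slice of T through x for the group P: same intercepts c_i, smaller r."""
    r_P = r - sum(x[j] / c[j] for j in c if j not in P)
    return {i: c[i] for i in P}, r_P

# Hypothetical three-agent problem; agent 3 leaves with his payoff.
c = {1: Fraction(3), 2: Fraction(6), 3: Fraction(2)}
x = nash(c)
c_P, r_P = reduced_problem(c, x, {1, 2})
y = nash(c_P, r_P)
```

Here `y` coincides with the restriction of `x` to agents 1 and 2, as consistency requires.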

The Nash solution can be characterized on the basis of consistency:

Theorem 3 (Lensberg, 1988): The Nash solution is the only solution on
$\Sigma_0$ satisfying Pareto optimality, anonymity, scale invariance, continuity, and
consistency.

Proof (Sketch): It is straightforward to see that $N$ satisfies the five axioms.
Conversely, let $F$ be a solution on $\Sigma_0$ satisfying the five axioms. We only
show that $F$ coincides with $N$ on $\Sigma_0^{\{1,2\}}$. Let $S \in \Sigma_0^{\{1,2\}}$ be given. By scale
invariance, we can assume that $S$ is normalized so that $N(S) = (1,1)$.
For simplicity, we assume that $PO(S) \supseteq [(1.5, .5), (.5, 1.5)]$. (In Figure 8,
$S = cch\{(2,0), (.5, 1.5)\}$.) Now, we translate $S$ by the third unit vector, we
replicate the resulting set twice by having agents 2, 3 and 1, and then agents
3, 1 and 2 play the roles of agents 1, 2 and 3 respectively; finally, we define
$T \in \Sigma_0^{\{1,2,3\}}$ to be the convex and comprehensive hull of the three sets so
obtained. Since the problem $T = cch\{(1, 2, 0), (0, 1, 2), (2, 0, 1)\}$ is invariant
under rotations of the agents, by anonymity, $F(T)$ has equal coordinates, and
by Pareto optimality, $F(T) = (1, 1, 1)$. But, since $t_{\{1,2\}}^{(1,1,1)}(T) = S$, consistency
gives $F(S) = (1,1) = N(S)$.




For the case that $PO(S) \not\supseteq [(1.5, .5), (.5, 1.5)]$ and $N(S)$ is contained in
the interior of $PO(S)$, the argument can be extended in the same way by
introducing more agents. For the remaining case, the argument is completed
by using continuity. Q.E.D.







5 Conclusion 



In the previous chapter we presented the classical results in bargaining the- 
ory and in this chapter we presented recent developments concerning variable 
population problems. These chapters are by no means comprehensive. There
has been considerable expansion in bargaining theory in recent years. Some 
of these recent developments include studies concerning changes in the dis- 
agreement point, uncertainty in the feasible set or the disagreement point (or 
both), bargaining problems with claims, non-convex problems, etc. We refer 
the reader to Thomson (1994) for an analysis of recent developments. 

1. Introduction

Pure bargaining games discussed in the previous two lectures are a special
case of n-person cooperative games. In the general setup coalitions other than
the grand coalition matter as well. The primitive is the coalitional form (or
"coalitional function", or "characteristic form"). The primitive can represent
many different things, e.g., a simple voting game where we associate to a
winning coalition the worth 1 and to a losing coalition the worth 0, or an
economic market that generates a cooperative game. Von Neumann and
Morgenstern (1944) suggested that one should look at what a coalition can
guarantee (a kind of constant-sum game between a coalition and its
complement); however, that might not always be appropriate. Shapley and
Shubik introduced the notion of a C-game (see Shubik (1982)): it is a game
where there is no doubt about how to define the worth of a coalition. This happens,
for example, in exchange economies where a coalition can reallocate its own
resources, independently of what the complement does.

We assume we are given a coalitional function. Let $N$ denote the set of
players; a subset $S \subseteq N$ is called a coalition; $V(S)$ is the set of feasible outcomes
for $S$.

How is an outcome defined? Assuming that some underlying utility
functions for the players are specified, one can represent outcomes by the
players' utilities. We thus use a payoff vector $x = (x^i)_{i \in S}$ in $\mathbb{R}^S$ to represent an
outcome, where $x^i$ is player $i$'s utility of the outcome. So $V(S) \subseteq \mathbb{R}^S$. Usually
there are some assumptions made on the set $V(S)$, e.g., comprehensive, closed,
convex, etc.

There are two special classes of games:

1) Pure bargaining games (PB): In these games only the grand coalition
matters. Here $V(S) = \{x \in \mathbb{R}^S : x^i \leq 0 \text{ for all } i \in S\}$ for all² $S \neq N$.

2) Transferable utility games (TU) (formerly called "games with side
payments"): Here one number represents what a coalition can get, and the
members of the coalition can arbitrarily divide this amount among themselves.
So a TU game is of the form $V(S) = \{x \in \mathbb{R}^S : \sum_{i \in S} x^i \leq v(S)\}$, where $v(S)$
is a number for all $S \subseteq N$. Geometrically these sets are half-spaces with normal
vector $(1, \ldots, 1)$.

¹ Lecture notes written by Yossi Feinberg.

² Sometimes this is relaxed to: $(0,\ldots,0) \in \operatorname{bd} V(S)$ for all $S \neq N$, where "bd" stands for
boundary.

The general games are usually referred to as games with non-transferable
utility, or NTU-games for short. The following diagram shows the relationship
between the different classes of games.




2. Solution Concepts 

We will distinguish between two approaches to solution concepts (though 
the distinction is not always clear cut): 

D - definition, description, discussion. 

A - axiomatization. 

Obviously there are other approaches, e.g., noncooperative, evolutionary, etc. 

The D-approach stands for various formal or informal arguments about what
the solution should look like. In the A-approach, one puts down a set of axioms
and gets as a result that these axioms uniquely characterize the solution
concept. Most solution concepts started out with the D-approach and only later
were axiomatized; the Shapley Value started out with the A-approach.

A p.v. (payoff vector) is a vector $x \in \mathbb{R}^N$. It is feasible if $x \in V(N)$, efficient
(or Pareto optimal) if $x \in \operatorname{bd} V(N)$, and individually rational if $x^i \notin \operatorname{int} V(\{i\})$
for all $i$. The set $X := \{x \mid x \text{ is an efficient and individually rational p.v.}\}$ is
called the set of imputations. For simplicity we assume that the set of imputations
is always non-empty. Thus in the TU case we consider only games which satisfy
$v(N) \geq \sum_{i \in N} v(\{i\})$. A solution concept associates payoff vectors (outcomes) with
each game.

3. The D-Approach
3.1 Core 

The idea of the Core is to look at those payoff vectors which no coalition
can improve upon. Let $x$ be a feasible p.v.; a coalition $\emptyset \neq S \subseteq N$ can improve
upon $x$ if there is a feasible outcome $y$ for $S$ ($y \in V(S)$) which is better for $S$, that
is, $y^i > x^i$ for all $i$ in $S$ (everyone in $S$ must agree that $y$ is better than $x$); we
write $y \succ_S x$. We say that $y$ dominates $x$, written $y \succ x$, if there exists a
coalition $S$ such that $y \succ_S x$. The Core is defined as the set of imputations that
are not dominated by any p.v.

In the NTU case a feasible p.v. $x$ satisfies: $x \in$ Core $\iff x^S \notin \operatorname{int} V(S)$ for all³ $S$.

In the TU case a feasible p.v. $x$ satisfies: $x \in$ Core $\iff x(S) \geq v(S)$ for all⁴ $S$.

A first question one poses is the non-emptiness of the Core. This is 
connected with superadditivity which states that joining two coalitions may only 
increase their possibilities. The non-emptiness of the core is implied by 
balancedness, which is a generalization of superadditivity. Certain classes of 
games such as market games turn out to have a non-empty core (under general 
conditions). In the case of market games, the competitive (Walrasian) 
equilibrium is always in the Core. On the other hand, unless there is a veto 
player, voting games have an empty Core. 
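
The TU characterization of the Core lends itself to a direct check. The following minimal Python sketch tests whether a given p.v. lies in the Core of a small TU game by enumerating all coalitions; the two three-player voting games (simple majority, and majority with player 1 as a veto player) and the function names are assumptions chosen here for illustration.

```python
from itertools import combinations

def coalitions(players):
    """All non-empty coalitions of `players`, as frozensets."""
    players = sorted(players)
    for k in range(1, len(players) + 1):
        for S in combinations(players, k):
            yield frozenset(S)

def in_core(v, x, tol=1e-9):
    """TU core test: x(N) = v(N) and x(S) >= v(S) for every coalition S."""
    N = frozenset(x)
    if abs(sum(x.values()) - v(N)) > tol:
        return False
    return all(sum(x[i] for i in S) >= v(S) - tol for S in coalitions(N))

def majority(S):
    """Three-player simple majority game: any two players win."""
    return 1 if len(S) >= 2 else 0

def veto(S):
    """Majority game in which player 1 is a veto player."""
    return 1 if 1 in S and len(S) >= 2 else 0

equal_split = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
all_to_veto = {1: 1.0, 2: 0.0, 3: 0.0}
```

In the majority game the equal split is blocked by any two-player coalition, while in the veto game giving everything to the veto player passes the test, in line with the remark on voting games above.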



3.2 von Neumann and Morgenstern Stable Set 

Recall that the Core is the set of all feasible p.v. that are not dominated by 
any p.v. . 

Consider now the following definition of a solution:

The "Shmore"⁵ is the set of all feasible p.v. that are not dominated by any
p.v. in the Shmore.

It turns out this concept is indeed well defined. The idea behind it is that
the set of "good" p.v. is to be compared against "good" p.v. The definition of
the Shmore can be rewritten as: $x \in$ Shmore $\iff y \not\succ x$ for all $y \in$ Shmore. This can
be further restated as follows: Let $K$ = Shmore; then

1) $x, y \in K \Rightarrow x \not\succ y$;

2) $y \notin K \Rightarrow$ there exists $x \in K$ such that $x \succ y$.

A set $K$ of efficient p.v. that satisfies these two conditions is called a von
Neumann and Morgenstern Stable Set. Note that the basic concept here is a set
concept, unlike the Core where the payoff vector's properties alone determine if
it is in the solution. The Stable Set becomes a "standard of behavior" in the
sense that if everyone believes that the solution is in $K$ it will indeed be in $K$.
Note that there may be more than one stable set in a game. Trivially, all von
Neumann and Morgenstern Stable Sets contain the core.

³ We write $x^S$ for the projection of $x$ on $\mathbb{R}^S$, i.e., $x^S = (x^i)_{i \in S}$.

⁴ Here we define $x(S) = \sum_{i \in S} x^i$.

⁵ This (temporary) name is due to R. J. Aumann.

There are games for which there are no Stable Sets. The first example was 
found by Lucas (1968) and was a 10-player game. Unfortunately even in simple
cases it is difficult to calculate all (or even one of) the Stable Sets of a game. But 
finding this solution is very rewarding: it gives a lot of insight. For example, in
voting games, where minimal winning coalitions seem important, it turns out
that Stable Sets actually predict the formation of minimal blocking coalitions.



3.3 Bargaining Sets 

The previous solution concepts are based on the idea that, given a p.v.,
some coalitions or players may object to it (using other feasible p.v.). Along this
line one can define a counter objection by objecting to the p.v. used for the
original objection. Then follows the notion of justified objections, defined as
objections that have no counter objection. Using these definitions one can define
the Bargaining Set as the set of efficient p.v. for which there is no justified
objection. This solution has many variants and was first conceived by Aumann
and Maschler (1964); see Davis and Maschler (1963, 1967), and also Mas-Colell
(1989) for a new approach. The work on Bargaining Sets led to the following
solution concept.



3.4 Nucleolus 

This is a one point solution defined for the TU case. There are various 
suggestions for the generalization of the Nucleolus to the NTU case, but this is 
not yet settled. 

Behind the notion of the Nucleolus is the following interpretation. Given a
p.v. $x$, each coalition $S$ looks at $v(S) - x(S)$; this number represents the "complaint"
of the coalition (it could be positive or negative). The higher the complaint, the
more loudly the coalition protests against $x$. Thus we want to minimize
complaints under the "budget constraint" (the feasibility of $x$). We do so starting
with the maximal complaint, i.e., we look at $\min_{x \in X} \max_{S} \, (v(S) - x(S))$.
Then we minimize the next highest complaint when considering only p.v.
which minimized the highest complaint, and so on. What we get is the
lexicographic minimum of all complaints. It turns out that we are left with a
unique p.v., which is the Nucleolus. This solution concept is due to Schmeidler
(1969). The Nucleolus was applied to various problems, such as an airport
landing fees problem in which airlines needed to share the cost of using
runways (each coalition of airlines needing its distinct minimal runway length).
We remark that when the Core is non-empty there is a feasible p.v. for which all
complaints are nonpositive. Thus, in this case the complaints for the Nucleolus
are nonpositive as well and we have that the Nucleolus is in the Core (it is
moreover a special point in the Core, a kind of symmetry center).
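
The lexicographic comparison of complaints can be made concrete. The Python sketch below does not compute the Nucleolus (that requires solving a sequence of optimization problems); under that stated limitation it only builds, for a given p.v., the vector of complaints ordered from highest to lowest, so that two p.v. can be compared lexicographically. The three-player majority game and the names `complaints`, `x`, `y` are assumptions made for illustration; the grand coalition is left out because its complaint is the same for all efficient p.v.

```python
from fractions import Fraction
from itertools import combinations

def complaints(v, x):
    """Complaints v(S) - x(S) of all proper non-empty coalitions, highest first."""
    players = sorted(x)
    out = []
    for k in range(1, len(players)):
        for S in combinations(players, k):
            out.append(v(frozenset(S)) - sum(x[i] for i in S))
    return sorted(out, reverse=True)

def majority(S):
    """Three-player simple majority game: any two players win."""
    return 1 if len(S) >= 2 else 0

third, half = Fraction(1, 3), Fraction(1, 2)
x = {1: third, 2: third, 3: third}
y = {1: half, 2: half, 3: Fraction(0)}

# Python compares lists lexicographically, which is exactly the order needed.
x_is_better = complaints(majority, x) < complaints(majority, y)
```

In this example the equal split has highest complaint 1/3, against 1/2 for `y`, so `x_is_better` is true.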

We have presented three kinds of solution concepts: one is a one-point 
solution (the Nucleolus); the second is a set of points (the Core), and the third is 
several sets of points (the Stable Sets). 



4. The A-Approach 

We move now to the second point of view on solution concepts, i.e., the 
axiomatic approach. One can always use the definition of a concept as its 
axiomatization, but obviously we would like to have more basic axioms with an 
intuitive meaning that characterize our concept. It should be noted that these 
characterizations are comparatively new. 

All these axiomatizations have in common the Consistency axiom (also 
called the Reduced Game Property). Consistency is based on the following idea: 
Assume we have a game and its solution. Suppose that a certain set of players 
agree to the solution. The reduced game is the game played by the remaining 
players, on the remaining payoffs. Consistency requires that the solution of the
reduced game coincide with the solution of the original game, restricted to the
remaining players.

Formally, let $(N,V)$ be an NTU game. Let $x$ be a p.v. and $T \subseteq N$ be a
coalition. We define the game $(T,V^*)$ (where $V^*$ depends on both $x$ and $T$) by⁶
$V^*(T) = \{y^T \mid (y^T, x^{T^c}) \in V(N)\}$, i.e., we give $x^{T^c}$ to the players in $T^c$ and consider
what we can give to the players in $T$ so that the whole vector is feasible in the
original game. For strict sub-coalitions $S \subset T$ ($S \neq T$) we define
$V^*(S) = \bigcup_{Q \subseteq T^c} \{y^S \mid (y^S, x^Q) \in V(S \cup Q)\}$, that is, we consider all sub-coalitions $Q$ of
$T^c$ as those which can complement the members of $S$ and create a feasible p.v.
for $S$ through a feasible p.v. for $S \cup Q$ in the original game.

The consistency or reduced game property states:

CONS: If $x$ is a solution of $(N,V)$ then $x^T$ is a solution of $(T,V^*)$ for all $T$.

It turns out that this definition of consistency yields many results.

⁶ $T^c = N \setminus T$ is the complement of $T$.
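
In the TU case the reduced game described above can be written with numbers instead of sets: $v^*(T) = v(N) - x(T^c)$, and $v^*(S) = \max_{Q \subseteq T^c} [v(S \cup Q) - x(Q)]$ for $S \neq T$. The Python sketch below implements this specialization and checks, on one example, that a Core element stays in the Core of the reduced game. The three-player game `worth`, the p.v. `x`, and the function names are hypothetical choices made for illustration.

```python
from itertools import combinations

def subsets(players):
    """All subsets of `players` (including the empty one), as frozensets."""
    players = sorted(players)
    return [frozenset(c) for k in range(len(players) + 1)
            for c in combinations(players, k)]

def reduced_game(v, N, x, T):
    """TU reduced game on T at the p.v. x (players outside T are paid x)."""
    N, T = frozenset(N), frozenset(T)
    Tc = N - T
    def v_star(S):
        S = frozenset(S)
        if not S:
            return 0
        if S == T:
            return v(N) - sum(x[j] for j in Tc)
        # S may recruit any group Q of departed players, paying them x(Q).
        return max(v(S | Q) - sum(x[j] for j in Q) for Q in subsets(Tc))
    return v_star

# Hypothetical three-player game and a p.v. in its Core.
worth = {frozenset(): 0,
         frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
         frozenset({1, 2}): 4, frozenset({1, 3}): 3, frozenset({2, 3}): 2,
         frozenset({1, 2, 3}): 6}
v = worth.__getitem__
x = {1: 3, 2: 2, 3: 1}
v_star = reduced_game(v, {1, 2, 3}, x, {1, 2})
```

Here $v^*(\{1,2\}) = 5$, $v^*(\{1\}) = 2$ and $v^*(\{2\}) = 1$, and the restriction $(3, 2)$ of `x` satisfies the Core inequalities of the reduced game.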

4.1 The TU Case

A solution associates a set of feasible p.v. with each game, i.e., it is a mapping
$(N,v) \mapsto \sigma(N,v)$ where $\sigma(N,v)$ is a set of feasible p.v. We shall consider the set of
games with non-empty Core. The axioms are:

NE (non-emptiness): $\sigma(N,v) \neq \emptyset$.

IR (individual rationality): In any p.v. in the solution every player gets at least
what the game guarantees him, i.e., $x \in \sigma(N,v)$ implies $x^i \geq v(\{i\})$ for all $i$.

CONS: (as above).

SUPA (superadditivity): $\sigma(N,v) + \sigma(N,w) \subseteq \sigma(N,v+w)$, where the summation here is
a set summation. Note that the set of players is always the same.

SIVA (single valuedness): $|\sigma(N,v)| = 1$.

AN (anonymity): If the games $(N,v)$ and $(N,v')$ are isomorphic, i.e., there exists
a one-to-one mapping $\Pi : N \to N$ such that $v(S) = v'(\Pi S)$ for all $S$, then
$\Pi\sigma(N,v) = \sigma(N,v')$ ($\Pi$ is a "relabeling of the players").

INV (TU invariance): For all $a > 0$ and $b \in \mathbb{R}^N$, if $w(S) = a v(S) + \sum_{i \in S} b^i$ for all $S$,
then $\sigma(N,w) = a\sigma(N,v) + b$.

Theorem (Peleg (1986)): The Core is the unique solution concept satisfying
NE, IR, CONS, SUPA.

(Note that this result requires a world with at least 3 players.)

Theorem (Sobolev (1975)): The PreNucleolus (defined in the same way as
the Nucleolus, with respect to all efficient but not necessarily individually
rational p.v. $x$) is the unique solution concept satisfying SIVA, AN, INV, CONS.
(This result requires a world with an infinite number of players.)

The axiomatization of the Stable Sets is an open problem. 



4.2 The NTU Case 

Theorem (Peleg (1985)): The Core is the unique solution concept satisfying 
NE,IR,CONS (under some regularity conditions on the games considered). 

(Note that this result requires a world with an infinite number of players.) 

1. Introduction

The Value is a solution concept originally due to Shapley (1953). The idea
behind the concept is to evaluate how much a player would be willing to pay to
participate in a given game. It seeks to represent what the game is worth to a
player. In some sense the value captures the expected outcome of the game. We
will start with the TU (transferable utility) framework and the axiomatic
approach and then consider various extensions to the NTU (non-transferable
utility) case.



2. TU Games 

A TU game $(N,v)$ is defined by associating a real number $v(S)$ to every
coalition $S \subseteq N$ (put $v(\emptyset) = 0$). $v(S)$ is referred to as the worth of the coalition $S$,
i.e., what the members of the coalition $S$ can divide between them. The value
will associate one p.v. (payoff vector) with each game; i.e., the value of a game
will be denoted $\varphi(N,v) \in \mathbb{R}^N$, where $\varphi^i(N,v)$ is the value of player $i$ in the game
$(N,v)$.

Shapley (1953) started out with the following set of axioms.

EFF (Efficiency) (also called PO, Pareto optimality):
$\varphi(N,v)$ satisfies $\sum_{i \in N} \varphi^i(N,v) = v(N)$.

ET (Equal Treatment) (usually called symmetry):
Two players $i, j \in N$ are called substitutes in $(N,v)$ if $v(S \cup \{i\}) = v(S \cup \{j\})$ for all
coalitions $S$ such that $i, j \notin S$. The axiom states that if $i, j$ are substitutes in the
game $(N,v)$ then $\varphi^i(N,v) = \varphi^j(N,v)$.

NP (Null Player) (or Dummy):
A player $i$ is a null player in $(N,v)$ if $v(S \cup \{i\}) = v(S)$ for all $S$. The axiom states
that if $i$ is a null player in $(N,v)$ then² $\varphi^i(N,v) = 0$.

ADD (Additivity):
For all players $i$ in $N$, $\varphi^i(N,v) + \varphi^i(N,w) = \varphi^i(N,v+w)$, where the game $(N,v+w)$ is
defined by $(v+w)(S) = v(S) + w(S)$ for all coalitions³ $S$.

Theorem (Shapley (1953)):

a) There exists a unique $\varphi$ satisfying EFF, ET, NP, ADD on the class of all TU
games.

b) $\varphi$ is given by $\varphi^i(N,v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \, [v(S \cup \{i\}) - v(S)]$.

¹ Lecture notes written by Yossi Feinberg.

The formula of $\varphi$ given in b) can be explained as follows. We look at
$v(S \cup \{i\}) - v(S)$, which is the marginal contribution of player $i$ to the coalition
$S$, and average the marginal contributions according to a distribution over $S$.
This distribution can be defined by looking at the players in a random order,
where the players preceding player $i$ in a given order form the coalition $S$. One
may think of the players as entering a room one by one in a random order, and
averaging the contribution of player $i$, when he enters the room, to the players
already in the room.

There is some distant indication of the idea of marginality in the NP (Null
Player) axiom. There, a player who always contributes marginally zero gets
zero value, but this is still very far from the marginal contributions that appear
in the formula. It is easy to see that the four axioms are satisfied by
$\varphi^i(N,v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \, [v(S \cup \{i\}) - v(S)]$ (to see that EFF is satisfied, notice that if we
take the marginal contributions of all players for a given order the terms in the
summation cancel each other out and we are left with $v(N)$, which is averaged
over all orders). We now sketch the proof in the other direction, i.e., that the
axioms imply the Shapley value. Consider a unanimity game $u_T$ for a given set
of players $T$, i.e., $u_T$ is defined so that all coalitions $S$ that contain $T$ have
$u_T(S) = 1$ and all the other coalitions get 0. By NP, players outside $T$ must get 0,
and by EFF and ET all players in $T$ must get $1/|T|$. These unanimity games form
a basis of the linear space of games. Thus by⁴ ADD we get that the solution has
to be the Shapley value since both are linear and agree on the basis of unanimity
games.
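
The random-order description translates directly into a computation. The following minimal Python sketch averages marginal contributions over all orders of the players (feasible only for small games, since the number of orders grows factorially) and checks the result on a unanimity game; the function names `shapley` and `unanimity` and the example are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

def shapley(players, v):
    """Average marginal contribution of each player over all orders."""
    players = list(players)
    phi = {i: Fraction(0) for i in players}
    orders = list(permutations(players))
    for order in orders:
        S = frozenset()                      # players already in the room
        for i in order:
            phi[i] += v(S | {i}) - v(S)      # marginal contribution of i
            S = S | {i}
    return {i: phi[i] / len(orders) for i in players}

def unanimity(T):
    """Unanimity game u_T: worth 1 for coalitions containing T, 0 otherwise."""
    T = frozenset(T)
    return lambda S: 1 if T <= frozenset(S) else 0

# Players 1 and 2 are substitutes; player 3 is a null player.
phi_u = shapley([1, 2, 3], unanimity({1, 2}))
```

For $u_T$ with $T = \{1, 2\}$ the members of $T$ get $1/2$ each and the null player gets 0, as derived from NP, EFF and ET above.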

As was mentioned above, the underlying notion here is "marginality", i.e.,
the value of a player is only a function of his marginal contributions. Thus we
introduce a new axiom.

MARG (Marginality):
Let $(N,v)$ and $(N,w)$ be two games (with the same set of players $N$). Let $i \in N$. If
$v(S \cup \{i\}) - v(S) = w(S \cup \{i\}) - w(S)$ for all $S$ such that $i \notin S$, then $\varphi^i(N,v) = \varphi^i(N,w)$.

It turns out that ADD (which is considered a strong axiom) and NP can be
replaced by MARG.

Theorem (Young (1985)):

The Shapley value is the unique solution concept satisfying EFF, ET, MARG.

² In the original paper by Shapley this axiom was combined with the EFF axiom.

³ Sometimes a different version of this axiom is used. It corresponds to playing the
average game, and it requires that $\frac{1}{2}\varphi^i(N,v) + \frac{1}{2}\varphi^i(N,w) = \varphi^i(N,(v+w)/2)$.

⁴ Note that the value of $c u_T$ is $c$ times the value of $u_T$ for all constants $c$, again by NP,
EFF and ET.

One can obtain the Shapley value through yet another approach. Consider
what player $j$ contributes to the value of player $i$, i.e., by how much the value of
$i$ changes when $j$ "drops out": $\varphi^i(N,v) - \varphi^i(N \setminus \{j\},v)$ (in the second term we
consider the subgame of $v$ with player set $N \setminus \{j\}$). It turns out that the Shapley
value satisfies that $j$'s contribution to $i$'s value always equals $i$'s contribution
to $j$'s value, i.e., $\varphi^i(N,v) - \varphi^i(N \setminus \{j\},v) = \varphi^j(N,v) - \varphi^j(N \setminus \{i\},v)$. Moreover, these
equations for all $i, j$ together with EFF uniquely characterize the Shapley value
(Myerson (1980) and Hart and Mas-Colell (1989)). The marginal contributions
above have a structural resemblance to derivatives and the requirement of equal
contributions reminds us of the mixed derivatives condition of Frobenius.
Taking this line of thought even further implies the existence of a "potential
function" whose "gradient vector" is the value p.v. One defines a real valued
function $P$ on games as a potential if it satisfies $\sum_{i \in N} [P(N,v) - P(N \setminus \{i\},v)] = v(N)$
for all games $(N,v)$.

Theorem (Hart and Mas-Colell (1989)):

There exists a unique function $P$ satisfying $\sum_{i \in N} [P(N,v) - P(N \setminus \{i\},v)] = v(N)$;
moreover, its "derivative" is the Shapley value, i.e., $\varphi^i(N,v) = P(N,v) - P(N \setminus \{i\},v)$.
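
The defining equation of the potential can be solved recursively: rearranging it gives $P(N,v) = \frac{1}{|N|}\left[v(N) + \sum_{i \in N} P(N \setminus \{i\},v)\right]$. The Python sketch below uses this recursion with the normalization $P(\emptyset,v) = 0$, which is an assumption made here to pin down the starting point, and recovers the value as a difference of potentials. The three-player game `worth` and the function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

def potential(v):
    """Return P on coalitions (frozensets), with P(empty set) = 0 assumed."""
    @lru_cache(maxsize=None)
    def P(S):
        if not S:
            return Fraction(0)
        # Rearranged defining equation: |S| * P(S) = v(S) + sum_i P(S \ {i}).
        return (v(S) + sum(P(S - {i}) for i in S)) / len(S)
    return P

# Hypothetical three-player game.
worth = {frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
         frozenset({1, 2}): 4, frozenset({1, 3}): 3, frozenset({2, 3}): 2,
         frozenset({1, 2, 3}): 6}
P = potential(lambda S: worth[S])
N = frozenset({1, 2, 3})
phi = {i: P(N) - P(N - {i}) for i in N}   # "derivative" of the potential
```

For this game $P(N) = 7/2$ and the differences give $(5/2, 2, 3/2)$, which sums to $v(N) = 6$ and agrees with averaging marginal contributions over all orders.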

In the first part of this lecture where we discussed core-like concepts the 
consistency axiom was shown to play a major role in axiomatization. Using a 
related definition of consistency yields another characterization of the Shapley 
value. 

Theorem (Hart and Mas-Colell (1989)):

EFF, ET, INV for⁵ 2-player games and CONS characterize the Shapley value⁶.

Here CONS is defined as follows: let $\sigma$ be a one-point solution concept. Let
$(N,v)$ be a game and $T$ a subcoalition of $N$; we define the game $(T,v^*)$ by
$v^*(S) = v(S \cup T^c) - \sigma(S \cup T^c, v)(T^c)$. Here we assume that all the players in $T^c$ agree
that $\sigma$ is a good solution concept. We look at the subgame $(S \cup T^c, v)$ and subtract
what the solution gives to $T^c$ from what the grand coalition in this game gets.
The axiom states that the solution of the reduced game $(T,v^*)$ is the restriction
to $T$ of the solution to the game $(N,v)$.

⁵ INV is the axiom of covariance with respect to linear transformation (see part 1 of
the lecture).

⁶ Notice the similarity between this characterization and Sobolev's characterization
of the Nucleolus; the difference hinges on the definition of CONS.

Unlike the previous definition of CONS, here we use the solution concept 
itself, applied to subgames, to define the reduced game. It turns out that in 
cost allocation problems this form of reduced game has a more natural appeal, 
and indeed the Shapley value is used in such problems. The notion of value has 
also been extensively applied to voting games (weighted majority games). In such 
games the core is often empty and there may be many stable sets (some of which 
are hard to find). Clearly, the total number of votes a coalition has is usually 
different from its value. In these games the Shapley value (known as the 
Shapley-Shubik index) is best viewed as the probability that a player is 
pivotal. For example, if we have one big party and many small parties, the value 
of the large party is higher than its share of the votes. But with two large 
parties and many small ones, the power of the large parties is greatly 
diminished and is lower than their actual share of the votes. These phenomena 
implied by the value are frequently observed in practice. The value is easy to 
apply and very tractable, and it has thus become one of the most widely applied 
solution concepts. Note that the value gives us a kind of expected outcome. 
Furthermore, the value is linear, unlike the core and the nucleolus, which are 
only piecewise linear. 
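The pivotal-player interpretation can be illustrated by direct enumeration. The sketch below assumes two hypothetical parliaments with simple-majority quotas (the weights are ours, chosen only to mirror the one-big-party and two-big-parties cases) and counts, over all orderings, how often each player is the one whose votes first reach the quota.

```python
from fractions import Fraction
from itertools import permutations

def shapley_shubik(weights, quota):
    """Shapley-Shubik index: the fraction of orderings in which each player
    is pivotal. Assumes sum(weights) >= quota; brute force, small n only."""
    n = len(weights)
    pivots = [0] * n
    count = 0
    for order in permutations(range(n)):
        count += 1
        running = 0
        for i in order:
            running += weights[i]
            if running >= quota:
                pivots[i] += 1  # player i turns the coalition into a winning one
                break
    return [Fraction(p, count) for p in pivots]

# One big party (3 of 7 votes) and four small ones; majority quota 4.
one_big = shapley_shubik([3, 1, 1, 1, 1], 4)
# Two big parties (3 of 9 votes each) and three small ones; majority quota 5.
two_big = shapley_shubik([3, 3, 1, 1, 1], 5)
print(one_big[0], two_big[0])
```

In the first parliament the large party's index is 3/5, above its vote share of 3/7; in the second each large party's index is 3/10, below its vote share of 1/3.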



3. NTU Games 

An NTU game $(N,V)$ is defined by associating a set $V(S) \subseteq \mathbb{R}^S$ 
with every coalition $S$. Consider the following diagram of classes of games 
(see previous chapter). 




If we take the classical solutions, the Shapley value in the TU-case and the 
Nash bargaining solution in the PB-case, one would like to extend these 
solutions to the whole space of NTU games. 

The Nash solution to 2-person PB games is characterized by the "equal 
angles" property, i.e., the marginal rate of substitution between the players' 
utilities at the solution point is equal to the ratio of the payoffs. This is 
shown in the following diagram. 




There are two underlying criteria that appear here. Let $c$ be a positive 
number. The first is the Utilitarian "local efficiency" criterion, which means 
that we are on the Pareto frontier and the marginal rate of substitution there 
is precisely $c$. The second is the Egalitarian criterion, which requires the 
payoffs to be distributed according to the ratio given by $c$. These two 
criteria are jointly satisfied at the Nash solution, i.e., they are satisfied 
for the same $c$. We would like to extend these criteria to the general NTU 
case. To do so, the first step is to define an egalitarian solution relative to 
a vector $\lambda$ of utility comparison weights (where $\lambda$ is a strictly 
positive vector in $\mathbb{R}^N$). This solution is called the 
$\lambda$-egalitarian solution. It tries to capture the idea that the gains from 
cooperation are split equally among the players (hence comparison weights are 
needed). The second step consists in endogenizing the determination of the 
comparison weights $\lambda$. This is done by demanding that $\lambda$ be such 
that the $\lambda$-egalitarian solution be also $\lambda$-utilitarian, i.e., 
that it maximizes the sum of $\lambda$-rescaled payoffs. A fixed point theorem 
asserts that there are weights which will yield such a value. This approach is 
due to Harsanyi (1963). 

A different approach was given by Shapley (1969). He looked at the 
induced TU game $(N,v_\lambda)$, given a vector $\lambda$ of utility comparison 
weights: 

$$v_\lambda(S) = \max\Big\{ \sum_{i \in S} \lambda^i x^i \;\Big|\; x \in V(S) \Big\}.$$

The Shapley value of the TU-induced game will be somewhere on the hyperplane 
and may well be nonfeasible in the NTU game, as is shown in the diagram below. 

[Diagram omitted. Labeled point: "Shapley value of TU game".] 

If the value of the TU game is feasible (thus the two points in the diagram above 
coincide) we get the Shapley NTU value. Again a fixed point argument ensures 
that the NTU value exists. Here we start with the marginal rate of utility 
substitution and check if it corresponds to the induced TU Shapley value. 
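Shapley's construction can be sketched on a toy example. The code below assumes a hypothetical two-player NTU game in which each $V(S)$ is the comprehensive hull of finitely many payoff vectors (the table `GENERATORS` is illustrative, not from the lecture). For a given weight vector it builds the induced TU game, computes its Shapley value, rescales it back to utilities, and tests feasibility in $V(N)$.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical NTU game: V(S) is the comprehensive hull of these points.
GENERATORS = {
    frozenset({1}): [{1: 0}],
    frozenset({2}): [{2: 0}],
    frozenset({1, 2}): [{1: 2, 2: 0}, {1: 1, 2: 1}, {1: 0, 2: 2}],
}

def v_lambda(S, lam):
    """Induced TU worth: max over x in V(S) of sum_{i in S} lam[i] * x[i]."""
    S = frozenset(S)
    if not S:
        return Fraction(0)
    return max(sum(Fraction(lam[i]) * x[i] for i in S) for x in GENERATORS[S])

def shapley(players, game):
    """Shapley value by averaging marginal contributions over orderings."""
    players = sorted(players)
    phi = {i: Fraction(0) for i in players}
    orders = list(permutations(players))
    for order in orders:
        seen = frozenset()
        for i in order:
            phi[i] += game(seen | {i}) - game(seen)
            seen = seen | {i}
    return {i: phi[i] / len(orders) for i in players}

def shapley_ntu_candidate(lam, players):
    """Rescale the Shapley value of v_lambda back to utilities and report
    whether the resulting payoff vector is feasible for the grand coalition."""
    phi = shapley(players, lambda S: v_lambda(S, lam))
    payoff = {i: phi[i] / Fraction(lam[i]) for i in players}
    feasible = any(all(x[i] >= payoff[i] for i in players)
                   for x in GENERATORS[frozenset(players)])
    return payoff, feasible

print(shapley_ntu_candidate({1: 1, 2: 1}, {1, 2}))
print(shapley_ntu_candidate({1: 2, 2: 1}, {1, 2}))
```

With equal weights the rescaled value (1, 1) is feasible, so it is a Shapley NTU value of this toy game; with weights (2, 1) the rescaled value (1, 2) lies outside $V(N)$, which is the situation drawn in the diagram.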

There are other NTU values and some applications of values (e.g., in 
market games). There are also axiomatizations of the Shapley NTU value 
(Aumann (1985)) and of the Harsanyi NTU value (Hart (1985)). 

When comparing the two approaches, i.e., Shapley's vs. Harsanyi's, it seems 
that Shapley's approach gives more weight to the grand coalition at the 
expense of smaller subcoalitions, whereas Harsanyi's approach does the 
opposite. This can be clearly seen in their axiomatizations. 

One should also mention a third NTU value. This is the consistent NTU value 
of Maschler and Owen (1989, 1992). The natural extension of CONS to 
NTU games is self-contradictory (no solution satisfies it), so they defined an 
"average" reduced game. Thus they have an NTU value with a notion close in 
spirit to CONS. 

This entire lecture has considered the traditional approach alone. It will be 
seen in a later lecture how these solutions may emerge from the noncooperative 
approach. 

Abstract. This paper surveys cooperative game theory when players have 
incomplete or asymmetric information, especially when the TU and NTU 
games are derived from economic models. First some results relating bal- 
anced games and markets are summarized, including theorems guaranteeing 
that the core is nonempty. Then the basic pure exchange economy is ex- 
tended to include asymmetric information. The possibilities for such models 
to generate cooperative games are examined. Here the core is emphasized 
as a solution, and criteria are given for its nonemptiness. Finally, an alter- 
native approach is explored based on Harsanyi’s formulation of games with 
incomplete information. 

Keywords. Asymmetric information, incomplete information, core, market 
games, NTU games, TU games 

1 Introduction 

This paper considers the incorporation of uncertainty and information into 
cooperative game theory. In particular, we examine the cooperative games 
that arise from economies with asymmetric information. To simplify, we fo- 
cus on the case of general equilibrium models of perfectly competitive pure 
exchange economies. 

One frequently encounters the opinion that cooperative game theory 
cannot easily be adapted to include informational considerations. In fact, 
economists’ interest in asymmetric information is sometimes cited as an im- 
portant reason for the recent emphasis on noncooperative models of strategic 
behavior, especially in fields such as industrial organization and corporate 
finance. I disagree with this viewpoint — one can put asymmetric information 
into cooperative games, albeit at the expense of certain complications which 
may lead to somewhat surprising results. 

Since we stress cooperative games that are derived from economies with 
asymmetric information, we first digress to present a more general, brief 
survey of the relationships between cooperative games and perfectly competitive 
exchange economies. After summarizing these results on market games in the 
next section, we proceed to introduce information in the following section. 
Section 4 is devoted to Wilson’s article on the core with asymmetric informa- 
tion. The derivation of cooperative games from economies with asymmetric 
information is examined in Section 5, as preparation for analysis of the core 
and the value in the following two sections. Section 8 concludes by presenting 
an alternative approach based on Harsanyi’s formulation of noncooperative 
games with incomplete information. 



2 Market Games 



To fix notation, let $N$ be the (finite) set of traders in the economy (or players 
in the game), and denote a typical agent by $i \in N = \{1, \ldots, n\}$ (where $n$ 
is the cardinality of the set $N$). Suppose that the number of commodities 
present in the economy is the finite positive integer $\ell$, and take $\mathbb{R}^\ell_+$ to be 
the consumption set of each trader $i \in N$. Traders are specified by initial 
endowment vectors and utility functions where, for each $i \in N$, $e_i \in \mathbb{R}^\ell_+$ and 
$u_i : \mathbb{R}^\ell_+ \to \mathbb{R}$ is a continuous (or, more generally, upper semicontinuous) and 
concave function representing the preferences of trader $i \in N$. By definition, a 
coalition is a nonempty subset of agents; the grand coalition $N$ is a coalition 
as is any nontrivial collection of agents. Each coalition induces a smaller 
economy containing only those traders who belong to the coalition; such 
subeconomies are called submarkets, and they induce subgames. 

An n-player cooperative game with transferable utility (or TU game) is a 
function $v : 2^N \to \mathbb{R}$ with $v(\emptyset) = 0$, where $2^N$ denotes the set of all subsets of 
$N = \{1, \ldots, n\}$. The TU cooperative game induced from the pure exchange 
economy as above, in which each trader $i \in N$ has consumption set $\mathbb{R}^\ell_+$, 
initial endowment $e_i \in \mathbb{R}^\ell_+$, and utility function $u_i : \mathbb{R}^\ell_+ \to \mathbb{R}$ which is 
assumed to be upper semicontinuous (so that maxima in the definition below 
are well defined), is given by $v : 2^N \to \mathbb{R}$ with $v(\emptyset) = 0$ and, for all $S \subseteq N$ 
with $S \neq \emptyset$, 

$$v(S) = \max\Big\{ \sum_{i \in S} u_i(x_i) \;\Big|\; x_i \in \mathbb{R}^\ell_+ \text{ for all } i \in S \text{ and } \sum_{i \in S} x_i \le \sum_{i \in S} e_i \Big\}.$$

[With sufficient monotonicity, the inequality sign can be replaced 
by an equality.] In words, $v(S)$ is the maximum total utility that the players 
in $S$ can achieve by redistributing their own resources; if members of coalition 
$S$ were to pool all of their initial endowments and redistribute these goods, 
so as to maximize the total utility of the entire coalition $S$, the resulting sum 
would equal $v(S)$. 
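As an illustration of the induced game, consider the special case of a single commodity and utilities $u_i(x) = a_i\sqrt{x}$ with $a_i > 0$. This functional form is our assumption, chosen because the maximization has a closed form: the first-order conditions give $x_i = E\,a_i^2/A$ with $E = \sum_{i\in S} e_i$ and $A = \sum_{i\in S} a_i^2$, so that $v(S) = \sqrt{AE}$. A minimal sketch:

```python
import math

def market_worth(S, a, e):
    """v(S) for one good and u_i(x) = a[i] * sqrt(x), a[i] > 0:
    closed form sqrt(sum_i a_i^2 * sum_i e_i) over i in S."""
    S = list(S)
    if not S:
        return 0.0
    return math.sqrt(sum(a[i] ** 2 for i in S) * sum(e[i] for i in S))

def optimal_allocation(S, a, e):
    """Allocation attaining v(S): x_i proportional to a_i^2."""
    S = list(S)
    total_e = sum(e[i] for i in S)
    total_a2 = sum(a[i] ** 2 for i in S)
    return {i: total_e * a[i] ** 2 / total_a2 for i in S}

# Hypothetical three-trader economy (illustrative numbers).
a = {1: 1.0, 2: 2.0, 3: 3.0}
e = {1: 1.0, 2: 1.0, 3: 1.0}
x = optimal_allocation({1, 2}, a, e)
print(market_worth({1, 2}, a, e), sum(a[i] * math.sqrt(x[i]) for i in x))
```

The two printed numbers coincide: the total utility at the computed allocation equals the closed-form worth of the coalition.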

The TU core of the n-person TU game $v$ is defined to be the set of all payoff 
vectors $w = (w_1, \ldots, w_n) \in \mathbb{R}^n$ such that (1) $(w_1, \ldots, w_n)$ is feasible 
(for $N$): $\sum_{i \in N} w_i \le v(N)$, and (2) $(w_1, \ldots, w_n)$ is not blocked by any 
coalition: $\sum_{i \in S} w_i \ge v(S)$ for all $S \subseteq N$. Feasibility of $w \in \mathbb{R}^n$ says 
that $(w_1, \ldots, w_n)$ is an imputation for $v$. The second property is sometimes 
described by the statement that no coalition can improve upon $w$. 

Transferable utility games with nonempty cores [note that the core always 
exists] can be characterized by a balancedness condition. If $N$ is any finite 
set, a balanced family $B$ of subsets of $N$ is $B \subseteq 2^N$ for which there exist 
balancing weights $\{\gamma_S\}_{S \in B}$ with $\gamma_S \ge 0$ ($\gamma_S \in \mathbb{R}$) such that for each $i \in N$, 
$\sum_{S \in B,\, S \ni i} \gamma_S = 1$. Obvious examples of balanced families include $N$ itself (with 
weight $\gamma_N = 1$) and $B = \{\{1\}, \ldots, \{n\}\}$ (again with each balancing weight 
equal to one). For a nontrivial example, consider the two-player coalitions of 
$N = \{1,2,3\}$ and set $\gamma_S = 1/2$ for $S = \{1,2\}$, $S = \{1,3\}$, and $S = \{2,3\}$. 
A TU game on $N$ is said to be balanced if, for all balanced collections $B$ of 
subsets of $N$ and all collections of associated balancing weights $\{\gamma_S\}_{S \in B}$, we 
have $\sum_{S \in B} \gamma_S v(S) \le v(N)$. [Note that the left-hand side of this inequality 
differs from the operation of taking convex combinations in that we could 
have $\sum_{S \in B} \gamma_S > 1$.] 

Theorem 1 A finite TU game has a nonempty core if and only if the game 
is balanced. 

This result was discovered independently by Bondareva (1962) and Shap- 
ley (1967). Its proof involves demonstrating that a certain system of linear 
inequalities has a solution precisely when the constraints defining balanced- 
ness are satisfied. 

For an example of a game that fails to be balanced, again let $N = \{1, 2, 3\}$, 
and define (with an obvious abuse of notation) $v(123) = v(12) = v(13) = 
v(23) = 1$ and $v(S) = 0$ otherwise. Then $v$ is not balanced, since 
examination of the balanced family of two-player coalitions would require 
$v(12)/2 + v(13)/2 + v(23)/2 \le v(123) = 1$, an obvious contradiction because 
the left-hand side equals $3/2$. Intuitively, we know that the TU core of this 
game is empty, because any two-player coalition that does not include the best 
treated player(s) can block any (feasible) imputation. The game describes a 
situation (“three men and a trunk”) in which three people discover buried 
treasure which can be removed from the jungle only if at least two individuals 
carry it. 
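The failure of balancedness in this example can be verified mechanically. The sketch below (function names are ours) checks that the three two-player coalitions with weights 1/2 form a balanced family and then evaluates the left-hand side of the balancedness inequality for the game above.

```python
from fractions import Fraction

def is_balancing(weights, players):
    """weights maps coalitions (frozensets) to nonnegative numbers; the
    family is balanced by them if every player is covered with total weight 1."""
    nonneg = all(g >= 0 for g in weights.values())
    covered = all(sum(g for S, g in weights.items() if i in S) == 1
                  for i in players)
    return nonneg and covered

def balanced_lhs(weights, v):
    """Left-hand side sum_S gamma_S * v(S) of the balancedness inequality."""
    return sum(g * v(S) for S, g in weights.items())

def trunk(S):
    """'Three men and a trunk': worth 1 for any coalition of two or more."""
    return 1 if len(S) >= 2 else 0

players = {1, 2, 3}
pairs = {frozenset(p): Fraction(1, 2) for p in [(1, 2), (1, 3), (2, 3)]}
print(is_balancing(pairs, players), balanced_lhs(pairs, trunk), trunk(players))
```

The family is balanced, yet the weighted sum is 3/2 while $v(N) = 1$, so the inequality fails and, by Theorem 1, the core is empty.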

The Bondareva-Shapley Theorem is of particular interest because it ap- 
plies to all market games as described above. Moreover, there is an equiva- 
lence between games satisfying a stronger balancedness property and those 
games that can be derived from pure exchange economies satisfying the con- 
ditions stated above. A totally balanced game is one for which every subgame 
is balanced. 

Theorem 2 Every market game derived from a finite pure exchange economy 
in which each trader $i$ has consumption set $\mathbb{R}^\ell_+$, initial endowment 
$e_i \in \mathbb{R}^\ell_+$, and utility function $u_i : \mathbb{R}^\ell_+ \to \mathbb{R}$, which is upper 
semicontinuous and concave, is totally balanced. Conversely, every n-player 
totally balanced TU cooperative game can be generated by a pure exchange 
economy as above with $\ell = n$. 

This result was discovered by Shapley and Shubik (1969) in their classic 
study of market games. The proof in one direction uses the balancing 
weights to define feasible allocations that are convex combinations of allo- 
cations available to smaller coalitions. Concavity of utilities then implies, 
by Jensen’s inequality, that total utility cannot be forced to decrease in the 
larger coalition. For the converse, Shapley and Shubik (1969) construct very 
special economies in which each player’s payoff essentially depends only on 
the player’s allocation of one commodity. [Note that the relations between 
exchange economies and totally balanced games cannot be described by a 
one-to-one correspondence because the space of n-player games can be 
identified with a Euclidean space of dimension $2^n - 1$, whereas the space of 
n-agent exchange economies parameterized by endowments and utilities must 
be infinite-dimensional. More specifically, changing a trader’s utility func- 
tion off of the compact set of feasible allocations cannot alter the TU game 
generated by the economy.] 

Cooperative games with nontransferable utility (or NTU games) can sim- 
ilarly be derived from economies. Of course, NTU games are preferable in 
general for economics, as they do not require one to impose the assump- 
tion that each agent’s preferences are representable by a quasilinear utility 
function in order to justify the addition of payoffs of different agents. [A 
quasilinear utility is a function of the form u(x) + m, where x can be a vector 
of goods and m denotes the quantity of a commodity — such as money — in 
which side payments are made.] 

Recall that an NTU cooperative game with player set $N = \{1, \ldots, n\}$ 
is a correspondence $V : 2^N \to \mathbb{R}^n$ such that, for all $S \subseteq N$, the sets $V(S)$ 
are nonempty, closed, and comprehensive [i.e., $V(S) \supseteq V(S) - \mathbb{R}^n_+$], and, 
moreover, the $V(S)$ sets are cylinder sets in that if $u = (u_1, \ldots, u_n) \in V(S)$ 
and if $u' = (u'_1, \ldots, u'_n)$ is such that $u_i = u'_i$ for all $i \in S$, then $u' \in V(S)$. I 
follow the convention that $V(\emptyset) = \mathbb{R}^n$. Define the projections of the $V(S)$ sets 
into the subspace of payoffs for players in $S$ by $V(S)_S = \{u \in V(S) \mid u_j = 0$ 
if $j \notin S\}$. Note that $V(N)_N = V(N)$ and $V(\emptyset)_\emptyset = \{0\}$. In addition, for each 
$S \subseteq N$, $V(S)_S$ generates the cylinder set $V(S)$. 

A cooperative game $V : 2^N \to \mathbb{R}^n$ with nontransferable utility is 
balanced if, for all balanced collections $B$ on $N$ with associated weights $\gamma_T$ for 
$T \in B$, $V(N) \supseteq \sum_{T \in B} \gamma_T V(T)_T$. [Note that since $B = \{N\}$ with $\gamma_N = 1$ 
is a balanced collection, taking the union over all balanced collections on 
the right-hand side gives a subset of $\mathbb{R}^n$ which precisely equals $V(N)$.] This 
definition of balancedness is well suited for economies with concave utilities. 
An alternative definition, which Billera (1974) terms “quasibalancedness,” is 
weaker. Say that an NTU game $V : 2^N \to \mathbb{R}^n$ is quasibalanced if, for all 
balanced collections $B$ on $N$, $\bigcap_{T \in B} V(T) \subseteq V(N)$. Every balanced game is 
quasibalanced. As in the case of transferable utility, games with 
nontransferable utility are said to be totally (quasi)balanced if all of their 
subgames are (quasi)balanced. 

The core of an NTU game is defined to be the set of feasible imputations 
that cannot be blocked (or improved upon) by any coalition. Formally, 
$u = (u_1, \ldots, u_n) \in \mathbb{R}^n$ belongs to the core of $V : 2^N \to \mathbb{R}^n$ if and only if 
$u \in V(N)$ and there does not exist a coalition $S \subseteq N$ (with $S \neq \emptyset$) and a 
payoff vector $u' = (u'_1, \ldots, u'_n) \in V(S)$ such that $u'_i > u_i$ for all $i \in S$. 
Note that, by 
definition, the core always exists — every game has a core, although it may 
be empty. Of course, we are interested in games with nonempty cores. One 
rationale for the core as a solution concept is the observation that, although 
not all points in the core may be attractive solutions, whenever the core is 
nonempty we may be justified in eliminating noncore outcomes from further 
consideration. 

Theorem 3 Every quasibalanced NTU game has a nonempty core. 

This result was proved by Scarf (1967). The implication holds in one 
direction only; in contrast to the TU case, one does not have equivalence. 
Moreover, because every balanced game is quasibalanced, Scarf’s Theorem 
implies that every balanced game (as defined above) has a nonempty core. 

Now let us return to our model of an exchange economy and show how 
it generates a well-behaved game with nontransferable utility. As before, we 
permit each coalition to redistribute its own resources provided that every 
coalition member receives an allocation belonging to the consumption set 
$\mathbb{R}^\ell_+$. Accordingly, define $V : 2^N \to \mathbb{R}^n$ by $V(\emptyset) = \mathbb{R}^n$ and, for each nonempty 
$S \subseteq N$, $V(S) = \{(w_1, \ldots, w_n) \in \mathbb{R}^n \mid$ there exists $(x_1, \ldots, x_n) \in \mathbb{R}^{\ell n}$ with 
$\sum_{i \in S} x_i \le \sum_{i \in S} e_i$, $x_i \in \mathbb{R}^\ell_+$ for each $i \in S$, and $w_i \le u_i(x_i)$ for all $i \in S\}$. 

By definition, the $V(S)$ are comprehensive cylinder sets. They are 
compactly generated (and, hence, closed as the sum of a closed set and a compact 
set) by the upper semicontinuity of utility functions. This implies that each 
$V(S)$ set is bounded above in all of the coordinates corresponding to 
players in $S$ or, equivalently, that the $V(S)_S$ sets are bounded above. Moreover, 
concavity of utility functions implies that each $V(S)$ or $V(S)_S$ set is convex. 

Finally, the economic model specified above gives rise to an NTU game 
which is totally balanced. The proof uses convex combinations of feasible 
allocations and concavity of utilities. This implies the following desirable 
property. 

Theorem 4 A finite pure exchange economy, having $n$ agents $i = 1, \ldots, n$ 
with consumption sets $\mathbb{R}^\ell_+$, initial endowments $e_i \in \mathbb{R}^\ell_+$, and utilities 
$u_i : \mathbb{R}^\ell_+ \to \mathbb{R}$ which are assumed to be concave and upper semicontinuous, 
generates a totally balanced NTU game so that the game and all of its 
subgames have nonempty cores. 

Billera (1974), Billera and Bixby (1974), and Mas-Colell (1975) examine 
whether totally balanced NTU games satisfying the properties mentioned 
above can be generated by economies. The results are less sharp than those 
for the TU case and require technical restrictions which are not discussed 
here. 

An extremely useful reference for much of this material is the book by 
Hildenbrand and Kirman (1976). The Shapley and Shubik (1969) article is 
also accessible. Needless to say, all students interested in cooperative game 
theory should read the following papers relating balancedness to the property 
of having a nonempty core: Bondareva (1962), Shapley (1967), and Scarf 
(1967). 

3 Economies with Asymmetric Information 

This section explains how one can add information to the basic model of a 
pure exchange economy. We are interested in situations in which different 
agents may initially possess different information. Moreover, the information 
must matter to traders. 

To model these phenomena, we begin with an arbitrarily given abstract 
set $\Omega$ of states of the world. Elements $\omega$ of the set $\Omega$ are assumed to 
completely describe the relevant uncertainty in the universe. A $\sigma$-field $\mathcal{F}$ of 
measurable subsets of $\Omega$ is also given. Subsets of $\Omega$ that belong to $\mathcal{F}$ are also 
termed events. Technically, $(\Omega, \mathcal{F})$ is a measurable space. Finally, $(\Omega, \mathcal{F})$ is 
endowed with a ($\sigma$-additive) probability measure $\mu$. [This could be 
generalized to permit agents to have different subjective probabilities regarding 
the ex ante likelihood of various events in $\Omega$, provided that all agree about the 
null events, i.e., those which occur with probability zero.] 

The information of trader $i \in N$ is given by a sub-$\sigma$-field $\mathcal{G}_i$ of $\mathcal{F}$. Notice 
that information becomes an ex ante concept, in that it means the capacity 
to condition one’s actions on a particular sub-$\sigma$-field, where the agent knows 
which sub-$\sigma$-field can be used. Thus, information is like an entire random 
variable (or measurable function from $\Omega$ to $\mathbb{R}$), rather than a single 
observation of the random variable (or a real number which equals the function 
evaluated at a specific $\omega \in \Omega$). Another analogy is that one should think of 
information as access to an instrument or measuring device, not as a 
measurement which is the output of the instrument. In particular, information 
is not equivalent to the fact that a certain state $\omega$ has actually occurred. 
Note that asymmetric information is sometimes called differential 
information, while incomplete information properly refers to situations in which $\mathcal{G}_i$ 
is smaller than $\mathcal{F}$, regardless of whether the $\mathcal{G}_i$ may be different for different 
agents. Symmetric information is a special case of asymmetric information, 
and complete information is a special case of incomplete information. 

A simpler model which captures most of the main ideas starts from a finite 
set $\Omega$ of states of the world, where each state occurs with strictly positive 
probability. Agents’ information is specified by partitions of $\Omega$. When state 
$\omega$ occurs, the agent learns the (unique) element of the partition containing 
$\omega$. 
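In the finite model, information and measurability reduce to elementary set operations. The following sketch (the names `cell_of` and `is_measurable` and the three-state example are our assumptions) represents an agent's information as a partition and checks that a function of the state is measurable with respect to it, i.e., constant on every cell.

```python
def cell_of(partition, state):
    """Return the unique cell of the partition containing the given state:
    this is what the agent learns when that state occurs."""
    for cell in partition:
        if state in cell:
            return cell
    raise ValueError("state is not covered by the partition")

def is_measurable(f, partition):
    """A function f (dict: state -> value) is measurable with respect to a
    partition exactly when it takes a single value on each cell."""
    return all(len({f[w] for w in cell}) == 1 for cell in partition)

# Hypothetical example: three states; the agent cannot tell 'a' from 'b'.
partition = [{"a", "b"}, {"c"}]
endowment = {"a": 1, "b": 1, "c": 5}   # depends only on what the agent knows
other = {"a": 1, "b": 2, "c": 5}       # would require distinguishing a and b

print(cell_of(partition, "a"), is_measurable(endowment, partition),
      is_measurable(other, partition))
```

The first function is an admissible state-dependent endowment for this agent; the second is not, because it varies within a cell.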

States of the world can also be interpreted as signals (about some un- 
derlying fundamental states of the world). However, rather than using dual 
terminology to include this case, I prefer to think of a state of the world as 
an n-tuple of the signals that have been received by each agent. 

The state of the world can affect traders’ endowments and utilities. We 
formalize this by two measurable functions defined on $\Omega$. For each $i \in N$, trader 
$i$’s initial endowments are given by $e_i : \Omega \to \mathbb{R}^\ell_+$, which is $\mathcal{G}_i$-measurable. 
The restriction to $\mathcal{G}_i$-measurability (rather than $\mathcal{F}$-measurability) means that 
trader $i$ must know his or her own initial endowment; the endowment vector 
can depend only on the trader’s own information. If $\Omega$ is infinite, we assume 
further that each $e_i$ is uniformly bounded almost surely in order to avoid 
technicalities; this condition is automatically satisfied for finite $\Omega$. 
State-dependent utilities are frequently written as functions $u_i : \mathbb{R}^\ell_+ \times \Omega \to \mathbb{R}$ 
which are continuous on $\mathbb{R}^\ell_+$ and $\mathcal{F}$-measurable on $\Omega$ (so that they are jointly 
measurable). We use $\mathcal{F}$-measurability instead of $\mathcal{G}_i$-measurability, because we 
envision that traders eventually learn their true utilities upon consumption 
of their allocations, but they may not fully know their state-dependent 
utilities when they make trades or choose strategies. The essential uncertainty 
here pertains to one’s own preferences. We assume (again, to avoid potential 
difficulties of a technical nature) that for almost all $\omega \in \Omega$ and all $i \in N$, the 
utility functions $u_i(\cdot\,;\omega) : \mathbb{R}^\ell_+ \to \mathbb{R}$ are not only continuous, but also strictly 
concave and strictly monotone. [For technical reasons (based on the fact that 
proper regular conditional probability distributions are defined only up to null 
sets) state-dependent utilities should be specified by $\mathcal{F}$-measurable functions 
$u_i : \Omega \to C(\mathbb{R}^\ell_+, \mathbb{R})$, where the space $C(\mathbb{R}^\ell_+, \mathbb{R})$ of continuous functions 
from $\mathbb{R}^\ell_+$ to $\mathbb{R}$ is endowed with the Borel $\sigma$-field corresponding to the 
topology of uniform convergence on compact subsets, which makes $C(\mathbb{R}^\ell_+, \mathbb{R})$ into 
a Fréchet space. All conditional expectations are taken with respect to the 
induced image measure on $C(\mathbb{R}^\ell_+, \mathbb{R})$, not on the abstract probability space 
$(\Omega, \mathcal{F}, \mu)$. Here one assumes that for all $i \in N$ and for almost all $\omega \in \Omega$, 
$u_i(\omega)$ is strictly monotone and strictly concave; it may also be convenient or 
necessary to assume that the (unconditional) distribution on $C(\mathbb{R}^\ell_+, \mathbb{R})$ has 
compact support for all $i \in N$.] 

An important conceptual problem with asymmetric information models 
is that one must carefully delineate those actions (i.e., trades or strategies) 
among which agents may choose. Radner (1968) considers the question of 
what people can do in a market when they have asymmetric information. 
He proposes that one should be able to verify one’s own (net) trades. For 
example, you will never pay a strictly positive amount to sign a contract with 
me stating that I will give you $100 if I do not have a headache tomorrow 
morning. If you do, I can always tell you that I have a headache, and you 
can never know that I’m lying. [You also can’t prove to a third party that 
I’m lying, which exemplifies the issue of verification rather than asymmetric 
information.] In competitive equilibrium models, agents trade impersonally 
with the market, which means that one’s own net trade should depend only on 
information available to the agent at the time the market meets. Hence, the 
individual excess demand of agent $i$ should be $\mathcal{G}_i$-measurable, which implies 
(because $e_i$ is $\mathcal{G}_i$-measurable) that $i$’s allocation is also $\mathcal{G}_i$-measurable. Radner 
(1968) demonstrates that, in such models in which consumers have different 
consumption sets because of the restrictions to subspaces of $\mathcal{G}_i$-measurable 
functions from $\Omega$ to $\mathbb{R}^\ell_+$, competitive equilibria exist, provided that $\Omega$ or all 
of the $\mathcal{G}_i$ are finite. 

However, the appropriate informational restrictions in cooperative games 
are less clear-cut. Do agents share their information freely within a coalition, 
or can coalitions only make binding agreements based on information which 
is common to all members? The same problem arises when one attempts 
to define Pareto optimality in asymmetric information models. [A different 
approach involving interim efficiency is explored by Holmstrom and Myerson 
(1983) and more recently by Forges (1990, 1991).] 



4 Wilson’s Article 

In a seminal article, Wilson (1978) examines the core of an economy with 
asymmetric information. He focuses on the need to define the information of 
players in a coalition when they (initially) have access to different informa- 
tion. 

The analysis is performed in a pure exchange environment with finitely 
many states. Initial endowments are assumed to be always measurable for 
every agent in every coalition. Wilson (1978) first defines the abstract concept 
of communication structures and then focuses on two special extreme cases: 
the coarse core, defined by the condition that the information for all players in 
coalition $S$ is precisely the sub-$\sigma$-field $\bigwedge_{i \in S} \mathcal{G}_i$ of information that they have 
in common, and the fine core, defined by giving every member of coalition $S$ 
the sub-$\sigma$-field $\bigvee_{i \in S} \mathcal{G}_i$ of pooled information, so that the coalition can use 
any information that was initially available to any of its members. 
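With finitely many states, common and pooled information correspond to the meet (finest common coarsening) and join (coarsest common refinement) of the members' partitions. The sketch below is one way to compute both under that finite-state assumption; the function names and the four-state example are ours, not Wilson's.

```python
def cell_of(partition, state):
    """The cell of the partition that contains the given state."""
    for cell in partition:
        if state in cell:
            return frozenset(cell)
    raise ValueError("state is not covered by the partition")

def join(partitions, states):
    """Pooled information: two states stay together only if every
    member's partition puts them in the same cell."""
    cells = {}
    for w in states:
        key = tuple(cell_of(P, w) for P in partitions)
        cells.setdefault(key, set()).add(w)
    return {frozenset(c) for c in cells.values()}

def meet(partitions, states):
    """Common information: merge blocks until every cell of every
    member's partition lies inside a single block."""
    blocks = [{w} for w in states]
    for P in partitions:
        for cell in P:
            touching = [b for b in blocks if b & set(cell)]
            if touching:
                merged = set().union(*touching)
                blocks = [b for b in blocks if not (b & set(cell))] + [merged]
    return {frozenset(b) for b in blocks}

states = {1, 2, 3, 4}
P1 = [{1, 2}, {3}, {4}]   # agent 1 cannot distinguish states 1 and 2
P2 = [{1}, {2, 3}, {4}]   # agent 2 cannot distinguish states 2 and 3
print(join([P1, P2], states))
print(meet([P1, P2], states))
```

Pooling lets the pair distinguish every state, whereas their common information only separates state 4 from the rest.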

Wilson (1978) then examines whether these two cores are nonempty. The 
intuition is as follows: The use of only common information renders block- 
ing difficult, so that the coarse core is expected to be nonempty. However, 
blocking is easy with pooled information, so that the fine core may be empty. 
To prove nonemptiness of the coarse core, Wilson (1978) argues that a game 
he defines, in which each “player” consists of a state-player pair, is balanced. 
For the fine core, he provides a counterexample with three states and three 
players in which any feasible, efficient allocation for the grand coalition can 
be blocked in some state by some coalition. 

However, when one contemplates these results in light of the market 
games literature and the relationships between balanced games and those 
with nonempty cores, the intuition about easy versus difficult blocking seems 
problematic. Potential blocking allocations should be compared to the set of 
allocations available to the grand coalition. As in the earlier market games 
literature, one examines convex combinations (with balancing weights) of 
feasible allocations. Concavity guarantees that the utilities of convex combi- 
nations dominate the average utilities of original allocations. 

For the coarse core, this argument seems to fail. Convex combinations of 
$\bigwedge_{i \in S} \mathcal{G}_i$-measurable and $\bigwedge_{i \in T} \mathcal{G}_i$-measurable functions need not be 
$\bigwedge_{i \in S \cup T} \mathcal{G}_i$-measurable. However, the paradox is resolved when one notices 
that Wilson’s (1978) model uses $\bigwedge_{i \in S} \mathcal{G}_i$ as the information for 
subcoalitions $S \subset N$ with $S \neq N$ and reverses this logic to take $\bigvee_{i \in N} \mathcal{G}_i$ as 
the information for the grand coalition $N$. This explains the apparent 
inconsistency of the two strategies for proving that the core is nonempty. 

In contrast, the argument that market games are balanced seems to apply 
to the fine core, as all convex combinations are measurable with respect to 
the information $\bigvee_{i \in N} \mathcal{G}_i$ of the grand coalition. Yet, detailed examination 

of Wilson’s (1978) counterexample indicates that the blocking he employs to 
show that the core is empty must occur ex post. In Wilson’s (1978) argument, 
some coalition blocks a given feasible allocation by dominating it in some 
particular state of the world. This would be consistent with a parallel state- 
by-state definition of the feasible and efficient allocations, so that in this 
case the economy with asymmetric information essentially reduces to three 
distinct economies in which all agreements and all trades take place after 
agents learn their information about the particular state of the world that 
has occurred. 

Kobayashi (1980) obtains some results extending Wilson’s (1978) coarse 
core using the concept of common knowledge. He also permits the set $\Omega$ of 
states of the world to be infinite. 



5 Market Games with Asymmetric Information 

In order to study cooperative solution concepts for economies with asymmet- 
ric information more systematically, one must derive the TU or NTU games 
that are generated by such economies. Standard results from game theory 
then apply, provided that the induced games are well defined and satisfy the 
necessary assumptions. Moreover, failures of certain solution concepts — such 
as the potential emptiness of the core — can be understood in terms of the 
game theoretic hypotheses that are violated as a consequence of asymmetric 
information. My formulation of market games with asymmetric information 
is based on ex ante agreements within coalitions. In particular, blocking can 
occur only before agents learn about the state of the world that has occurred, 
so that all payoffs in the resulting games consist of (unconditional) expected 
utilities (of information-conditional allocation functions). An advantage of 
this approach is that it enables agents to engage in risk-sharing trades and to 
write contracts that Pareto dominate those available with ex post agreements 
and ex post blocking. 

Before the market games can be defined, one must specify the information 
available to every agent in every coalition. Let $\mathcal{H}_i^S$ denote the information 
that agent $i \in S$ can use as a member of coalition $S$. Assume that $\mathcal{H}_i^S$ is a 
sub-$\sigma$-field of $\mathcal{F}$ and that $e_i : \Omega \to \mathbb{R}^\ell_+$ is measurable with respect to $\mathcal{H}_i^S$ 
for all $S \ni i$ and all $i \in N$. Note that all members of a coalition need not 
be restricted to the same information; $\mathcal{H}_i^S \neq \mathcal{H}_j^S$ is permitted. A natural 
assumption is that $\mathcal{H}_i^S = \mathcal{G}_i$ for all $S = \{i\}$ and all $i \in N$. 

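Given an economy with asymmetric information as modeled in Section 3 
and given the sub-$\sigma$-fields $\mathcal{H}_i^S$ for all $S \subseteq N$ ($S \neq \emptyset$) and all $i \in S$, define the 
induced cooperative game $v : 2^N \to \mathbb{R}$ with transferable utility by $v(\emptyset) = 0$ 
and 

$$v(S) = \max\Big\{ \sum_{i \in S} \int_\Omega u_i(x_i(\omega);\omega)\, d\mu(\omega) \;\Big|\; \text{for all } i \in S,\ x_i : \Omega \to \mathbb{R}^\ell_+ \text{ is } \mathcal{H}_i^S\text{-measurable and } \sum_{i \in S} x_i(\omega) = \sum_{i \in S} e_i(\omega) \text{ a.s.} \Big\}$$

for all $S \subseteq N$ with $S \neq \emptyset$. 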

Theorem 5 The induced TU game v defined above is well defined. 

To show that the game is well defined requires proving that the maximum 
exists. Doing so gives rise to technical difficulties (to be discussed briefly 
below) whenever $\Omega$ is infinite. 

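Similarly, one can define the derived NTU games. Let $V : 2^N \to \mathbb{R}^n$ be 
defined by $V(\emptyset) = \mathbb{R}^n$ and, for $S \subseteq N$ with $S \neq \emptyset$, $V(S) = \{(w_1, \ldots, w_n) \in 
\mathbb{R}^n \mid$ there exist $\mathcal{H}_i^S$-measurable functions $x_i : \Omega \to \mathbb{R}^\ell_+$ with $\sum_{i \in S} x_i(\omega) = 
\sum_{i \in S} e_i(\omega)$ a.s. such that $w_i \le \int_\Omega u_i(x_i(\omega);\omega)\, d\mu(\omega)$ for all $i \in S\}$. 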

Theorem 6 The induced NTU game is well defined. Moreover, for all $S \subseteq N$, 
the $V(S)$ sets are convex, and they are compactly generated whenever $S \neq \emptyset$. 

Closedness of the $V(S)$ sets is roughly equivalent to existence of the 
maxima for TU games; this is difficult when $\Omega$ fails to be finite. Convexity follows 
from concavity of utilities, while the property of being compactly generated 
comes from the uniform boundedness of initial endowments and upper semi- 
continuity of utilities. 

If $\Omega$ is infinite, the argument exploits the characterization of weakly and
strongly compact convex subsets in $L^1$ spaces, especially the theorem of
Dunford and Pettis (1940). [See Dunford and Schwartz (1958) or Rudin (1973)
for technical background material.] Details appear in Allen (1991a, 1991b,
1991c). Page (1993) extends these theorems to allow the underlying
commodity space to be replaced by an infinite-dimensional space.
6 Cores with Asymmetric Information



In this section, we utilize balancedness conditions on the derived TU and 
NTU games to obtain nonempty cores with asymmetric information. A sum- 
mary of the finite state case appears in Allen (1994), while the basic general 
references are Allen (1991b, 1991c). 

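Theorem 7 A sufficient condition for balancedness of the derived TU or
NTU game is that for all coalitions $S \subseteq N$ and all agents $i \in S$,
$\mathcal{H}_i^S \subseteq \mathcal{H}_i^N$. Total balancedness holds if
$\mathcal{H}_i^S \subseteq \mathcal{H}_i^T$ whenever $i \in S \subseteq T \subseteq N$.
In particular, if $\mathcal{H}_i^S \subseteq \mathcal{H}_i^N$ whenever $i \in S$, then
the core is nonempty, while $\mathcal{H}_i^S \subseteq \mathcal{H}_i^T$ whenever
$i \in S \subseteq T \subseteq N$ implies that the cores of all submarkets
are nonempty.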

The proof is based on the observation that sums of functions measurable
with respect to different sub-$\sigma$-fields are measurable with respect to the
smallest $\sigma$-field generated by all of the sub-$\sigma$-fields. The second
statement follows from Bondareva (1962) and Shapley (1967) or Scarf (1967).

Theorem 7 implies that the fine information [i.e.,
$\mathcal{H}_i^S = \sigma(\bigcup_{j \in S} \mathcal{G}_j)$ whenever $i \in S$] core in
the sense defined here with ex ante blocking is nonempty. It also implies that
the private information [i.e., $\mathcal{H}_i^S = \mathcal{G}_i$ for all $i \in S$ and
all coalitions $S \subseteq N$] core is nonempty. It does not apply to the coarse
information [$\mathcal{H}_i^S = \bigcap_{j \in S} \mathcal{G}_j$ whenever
$i \in S \subseteq N$] core and, in fact, counterexamples are not too difficult to
find. However, a consequence of the theorem is that Wilson’s coarse core
[$\mathcal{H}_i^S = \bigcap_{j \in S} \mathcal{G}_j$ for $i \in S$ if $S \neq N$ and
$\mathcal{H}_i^N = \sigma(\bigcup_{j \in N} \mathcal{G}_j)$] is necessarily nonempty.
Of course, many other specifications for information sharing within coalitions
are possible, and the theorem provides sufficient (but not necessary)
conditions for such models to yield nonempty cores.

Yannelis (1991) shows that exchange economies have private information 
core allocations. Allen (1992) provides a different proof, under somewhat dif- 
ferent assumptions, which follows directly from the market games approach. 
In Allen (1993a), private information sharing is related to a condition, termed 
publicly predictable information, stating that any single agent’s information 
can always be deduced from the pooled information of all other coalition 
members. 



7 Values with Asymmetric Information 



Having derived cooperative games from economies with asymmetric
information, one can apply any of the myriad of alternative (TU or NTU) solution
concepts, provided that the requisite hypotheses are satisfied by the derived
game. The value is only one of many solution concepts, albeit one that
has nice properties and has proved to be extremely useful for many problems
in economics. Thus, it is discussed here for illustrative purposes.

Theorem 8 The TU Shapley value of the derived game exists and is unique.
The NTU value exists if $\mathcal{H}_i^S \subseteq \mathcal{H}_i^N$ whenever $i \in S$.

The qualification for NTU games is needed for monotonicity of the
$\lambda$-transfer games. See Allen (1991a) for details, which use results from
Aumann and Shapley (1974, Appendix A) and Shapley (1969).

Krasa and Yannelis (1994) show via a direct argument that economies 
with asymmetric information have value allocations. They do not examine 
all of the information sharing possibilities that are covered by the above 
argument. 



8 The Harsanyi Approach 

A different framework for the analysis of information in cooperative game the- 
ory is based on Harsanyi’s (1967-68) formalization of noncooperative games 
with incomplete information. Recall that a noncooperative game is specified
by a player set $N = \{1, \ldots, n\}$, strategy sets $S_i$ for each $i \in N$, and
payoff functions $u_i : \prod_{i \in N} S_i \to \mathbb{R}$ for each $i \in N$. To capture
the notion of incomplete
information, Harsanyi (1967-68) replaces the single payoff function for
each player by payoff functions that are parameterized by a type space. The 
basic idea is that a type for player i is taken to consist of the player’s own pay- 
off function and his or her beliefs about the payoff functions of other players, 
which are distributions (possibly depending on the player’s own type) over 
the product of other players’ type spaces. 

If one contemplates this approach in the context of cooperative theory, 
problems arise even with transferable utility. For instance, when i learns his 
or her own type with certainty, does player i then know the entire game 
$v : 2^N \to \mathbb{R}$ in characteristic function form (in which case, either
beliefs are inconsistent or there is no asymmetric information), or does player
$i$ only have some belief about the correct distribution over possible
characteristic functions? How do players make enforceable agreements within
coalitions when they believe they’re playing different games, i.e., when their
beliefs over $v(S)$ are different? Even if coalition members agree about $v$, they may
disagree about how they can actually achieve the maximal worth of their 
coalition. For example, you and your spouse could both believe that you can 
double your household wealth, but you may disagree over whether this can be 
done by buying orange juice futures or by shorting Singapore stocks. What, 
then, is the worth of such a coalition? 

These problems can be interpreted as suggesting the need to include func- 
tions from strategy sets or actions to payoffs as part of the primitive descrip- 
tion of a cooperative game with incomplete information. Under complete or 
symmetric information, cooperative game theory usually suppresses any
explicit notion of strategies or actions, although implicitly, when we say that
$v(S)$ is the worth of coalition $S$, we really mean that the players of coalition
$S$ together have some (feasible) joint strategy that enables them to earn $v(S)$.
Under incomplete information, the strategies should be explicitly included in
our cooperative games. Observe that this issue did not arise for market games
with asymmetric information because we did write the strategies in the
definition of $v(S)$ and $V(S)$; the strategies were state-dependent net trades (or
state-dependent allocations) for each player in the coalition.

A pathbreaking article by Myerson (1984) explores such a formulation of
cooperative games with incomplete information. For each coalition $S$, let
$\mathcal{D}_S$ denote the set of actions available to $S$, and assume that
$\mathcal{D}_S \times \mathcal{D}_T \subseteq \mathcal{D}_{S \cup T}$ when
$S \cap T = \emptyset$. [This is a superadditivity condition.] Take the set $T_i$ of
types for player $i$ to be a finite set for all $i \in N$; assume that all
combinations of types [i.e., all $n$-tuples $(t_1, \ldots, t_n) \in \prod_{i \in N} T_i$]
occur with strictly positive probability. Write $T_S = \prod_{j \in S} T_j$ for the set
of profiles of types in the coalition $S$. Let $u_i(d, t_1, \ldots, t_n)$ be the
payoff to player $i \in N$ if the grand coalition $N$ chooses strategy
$d \in \mathcal{D}_N$ when players’ types are $(t_1, \ldots, t_n) \in T$.
This model permits externalities, although there is no obvious way to define
subgames except by having the subgames depend on some given type
realization and action of the complementary coalition. [This situation is worse
than in cooperative games with complete information in that, while a coalition
can perhaps observe the action of its complement, the coalition may have no
way to ascertain the type drawings of players who do not belong to the
coalition.] Myerson (1984) further assumes the consistency condition of Harsanyi
(1967-68) that there exists a probability $p$ on $T$ such that its conditional
distributions satisfy $p_i(t_{-i} \mid t_i) = p(t) / \sum p(s_{-i}, t_i)$, where the
summation is taken over $s_{-i} \in T_{-i}$. Then a cooperative game with
incomplete information and player set $N$ is defined by the action sets
$\mathcal{D}_S$, the type sets $T_i$, the payoff functions $u_i$, and the beliefs
$p_i$ satisfying the above assumptions. Myerson (1984) studies bargaining
solutions in such games.
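
The consistency condition can be illustrated numerically. The following is a
minimal sketch, assuming a hypothetical two-player example with two types
each and a made-up common prior; the name conditional_belief and all numbers
are illustrative assumptions and are not taken from Myerson (1984).

```python
from fractions import Fraction
from itertools import product

# Hypothetical type sets T_1 and T_2 (illustrative only).
types = [("a", "b"), ("x", "y")]

# Hypothetical common prior p on T = T_1 x T_2; every profile has
# strictly positive probability, as the model requires.
prior = {
    ("a", "x"): Fraction(4, 10),
    ("a", "y"): Fraction(1, 10),
    ("b", "x"): Fraction(2, 10),
    ("b", "y"): Fraction(3, 10),
}

def conditional_belief(i, t_i):
    """Belief of player i (0-indexed) over the others' types, given own type t_i.

    Computes p_i(t_{-i} | t_i) = p(t) / sum over s_{-i} of p(s_{-i}, t_i).
    """
    profiles = [t for t in product(*types) if t[i] == t_i]
    marginal = sum(prior[t] for t in profiles)
    # Key each entry by the profile with player i's own coordinate removed.
    return {t[:i] + t[i + 1:]: prior[t] / marginal for t in profiles}

belief = conditional_belief(0, "a")
print(belief)
```

Exact fractions are used here only so that the conditional probabilities sum
to one without rounding error.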

This model forms the basis for recent research by Allen (1993b), Ichiishi 
and Idzik (1992), and Rosenmüller (1990), among others. As this work focuses
on issues of incentive compatibility and, hence, relates to implementation, I 
do not discuss it further here. 
1 Introduction

The topic to be reviewed in this lecture is included in what Bob Aumann 
described in his lecture as the bridges between cooperative and noncoopera- 
tive theory. If I had all the time in the world, I would begin by presenting 
the basics of noncooperative game theory, but I cannot possibly do this. I
will therefore remain very elementary, and I will be somewhat loose about
the noncooperative concepts.¹ The flavor of what I will be doing today
consists in writing down or describing game procedures, understood as non- 
cooperative mechanisms for interaction, discussion, and the formulation of 
agreements about how to split things. These bargaining procedures will be 
set in a context which will stay very close to the frameworks presented by 
earlier lecturers. We will then see how the noncooperative solutions of the 
bargaining procedures relate to the axiomatic procedures presented earlier 
by others. 

In interpreting these procedures, there are two positions that we can take,
not quite two points of view, but two sources of light with which we can
look at this sort of theory. The first is the descriptive source of light, and 
the second is the prescriptive. 

The descriptive view In the descriptive view, the noncooperative
procedure comes first, not only logically but also conceptually and theoretically.
We are discussing bargaining procedures, and when we analyze these
procedures, we may discover that the equilibria exhibit some relationship with an
axiomatically based solution. Then, if we wish, we may call the bargaining
procedure under discussion the noncooperative foundation of the axiomatic
solution. But we certainly view the noncooperative approach as the
conceptual starting-point.

¹For general references on game theory, see Myerson (1991) or Osborne and
Rubinstein (1994).

The prescriptive approach The prescriptive approach relates more to 
implementation theory. (The elements of this theory will be presented in a 
forthcoming lecture.) Here, the point of view is to think of the axiomatic 
solutions as well founded on the axiomatic grounds on which they are pre- 
sented. However, one recognizes that to reach them, it may be necessary to 
design devices — call them bargaining procedures — that will yield the ax- 
iomatic solutions as noncooperative equilibria. So, logically, the cooperative 
part comes first, and we really think of the noncooperative part of the theory 
as an instrument with which to obtain the cooperative result. 

As I said, the distinction is very much there, but I would not want to 
trace a rigid boundary for the purposes of this lecture. 

In this first part of the lecture, I will discuss two-player games, and in the
second part, I will say something about N-player games.



2 Two-player games

In this section, we have two players, those in $N = \{1, 2\}$. I will start with
transferable utility (TU) games, but I will move to the non-transferable utility
(NTU) case very soon.

2.1 TU case 

2.1.1 Cooperative approach

Think of two players that have to split a pie. If they cooperate, then they
will get a total amount of utils, or dollars or whatever, equal to $v(N)$. If
they do not cooperate, then they will get a certain point
$(c_1, c_2) = c \in \mathbb{R}^2$.

It would be tempting to adhere to the exact framework of a characteristic
function, and write, say, $c_1 = v(1)$ and $c_2 = v(2)$. We could do this, but we
will not. It is not clear that we should really think of $c_1$ and $c_2$ as if they
were exactly what $v(1)$ and $v(2)$ would be in a cooperative framework. There is
no need to regard $c_1$ as what 1 could get by himself, and similarly for $c_2$. We
only require that the combination $c = (c_1, c_2)$ is what would happen if there
were no cooperation.

We do assume that 

$$c_1 + c_2 < v(N)$$

so that there is some reason to cooperate. Graphically, we present this in
figure 1. Here, the segment $AB$ is the utility possibility frontier
$\{(u_1, u_2) \mid u_1 + u_2 = v(N)\}$.

Figure 1: Standard solution in the TU case

Now, let us make a slight conceptual jump, and let us associate the vector
$c$ with the threat point of a bargaining problem (like those presented in
W. Thomson’s lecture). Then we see that each of the solutions that we have
discussed from an axiomatic perspective has the property that it splits the
surplus above $c$, rather than, say, giving each player $\frac{1}{2} v(N)$. Thus, we
end up at the midpoint between $a^1$ and $a^2$ shown above in figure 1. We will
call this split the standard solution:



$$b = \left( c_1 + \frac{v(N) - c_1 - c_2}{2},\; c_2 + \frac{v(N) - c_1 - c_2}{2} \right),$$

where $v(N) - c_1 - c_2$ is the surplus that is split.



2.1.2 Noncooperative approach

There is a very simple way to obtain the standard solution noncooperatively: 
the all-or-nothing (or take-it-or-leave-it) mechanism. Consider the following 
bargaining procedure: 



• Choose one player by tossing a coin. Call this player the proposer.

• The proposer proposes a split of $v(N)$: $(u_1, u_2)$.

• The respondent accepts or rejects.

- acceptance ⇒ $(u_1, u_2)$.

- rejection ⇒ $(c_1, c_2)$.



To solve this (extensive form) game, the natural noncooperative solution
concept is backward induction (or, if you prefer, you can say that I am
choosing the perfect equilibrium of this game). If 1 is the proposer, then how
will 1 reason? He will say, "If I don’t offer 2 at least $c_2$, then 2 will reject,
because by rejecting she can get $c_2$." Therefore, 1 will propose the split that
gives him the maximum amount compatible with 2 getting $c_2$, and this is
the point $a^1$ in figure 1. (We always assume that player 2 is cooperative
enough with player 1 that she breaks ties in his favor, so I won’t have to
worry about little $\varepsilon$’s.) Thus, 1 will propose $a^1$, and similarly, when 2 is
the proposer, she will propose $a^2$ and get all the surplus herself. If we accept
the von Neumann-Morgenstern expected utility theory, then, in expectation, the
outcome is $\frac{1}{2}(a^1 + a^2)$, which is exactly the standard solution defined above.
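
The backward-induction argument can be checked on numbers. The following is
a minimal sketch, assuming the tie-breaking rule stated above; the function
names and the sample values $v(N) = 10$, $c = (2, 4)$ are illustrative
assumptions only.

```python
def take_it_or_leave_it(v_n, c):
    """Backward induction in the one-shot procedure (TU case).

    v_n is the total worth v(N); c = (c_1, c_2) is the disagreement point.
    Returns the two proposals a^1, a^2 and their 50/50 expectation.
    """
    c1, c2 = c
    assert c1 + c2 < v_n, "there must be a surplus to split"
    a1 = (v_n - c2, c2)  # 1 proposes: 2 gets exactly c_2 and accepts
    a2 = (c1, v_n - c1)  # 2 proposes: 1 gets exactly c_1 and accepts
    expected = tuple((x + y) / 2 for x, y in zip(a1, a2))
    return a1, a2, expected

def standard_solution(v_n, c):
    """Each player gets the disagreement payoff plus half of the surplus."""
    surplus = v_n - c[0] - c[1]
    return (c[0] + surplus / 2, c[1] + surplus / 2)

a1, a2, expected = take_it_or_leave_it(10.0, (2.0, 4.0))
print(a1, a2, expected, standard_solution(10.0, (2.0, 4.0)))
```

In this example the proposals are $(6, 4)$ and $(2, 8)$, and their average
coincides with the standard solution $(4, 6)$.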

So, are we done? Have we implemented the standard solution noncoop- 
eratively? The answer is, not quite. Why not? Well, the procedure just 
described has some drawbacks. One is very apparent, and the other will 
become clear momentarily when we move to the NTU case. The apparent 
drawback is that we get the correct outcome only in expectation. It is true 
that, ex ante, each player $i$ gets $b_i$, but the proposals that actually take place
in the game are not the standard solution. They are either $a^1$ or $a^2$.
(Actually, in the current TU case, there is an easy fix for this problem: just
perform the procedure twice instead of once. But as we will now see, the
matter is not always so simple.)

2.2 NTU case 

2.2.1 Failure of the take-it-or-leave-it procedure

Figure 2: Failure of the one-stage mechanism in the NTU case

We draw the problem graphically as before (figure 2). Note that, with $a^1$
and $a^2$ defined as before, the 50/50 expectation of the two outcomes is not
even efficient. Of course, we could get efficiency by not randomizing, and by
simply imposing that 1 begin, or 2 begin, but then we would not implement
the standard solution. While this is a very simple example, it illustrates an
issue that tends to be a general problem. If you are in the TU case, then you
can implement (in expectation) by relatively short bargaining procedures; but
these procedures are not likely to be efficient if you have an NTU problem.
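
To see the inefficiency concretely, here is a minimal sketch on a made-up
NTU problem. The frontier $u_2 = 1 - u_1^2$ and the disagreement point
$c = (0, 0)$ are assumptions chosen for illustration; they do not come from
the lecture.

```python
import math

def frontier(u1):
    """Hypothetical strictly concave frontier of V(N): u2 = 1 - u1**2 on [0, 1]."""
    return 1.0 - u1 ** 2

def frontier_inverse(u2):
    """Largest u1 compatible with player 2 receiving u2."""
    return math.sqrt(1.0 - u2)

c = (0.0, 0.0)                       # disagreement point
a1 = (frontier_inverse(c[1]), c[1])  # 1 proposes: 2 is held down to c_2
a2 = (c[0], frontier(c[0]))          # 2 proposes: 1 is held down to c_1
midpoint = ((a1[0] + a2[0]) / 2, (a1[1] + a2[1]) / 2)

# Gap between what 2 could get on the frontier at u1 = midpoint[0]
# and what the 50/50 lottery actually gives her in expectation.
slack = frontier(midpoint[0]) - midpoint[1]
print(a1, a2, midpoint, slack)
```

Here the two proposals are $(1, 0)$ and $(0, 1)$, so the expected outcome is
$(0.5, 0.5)$, while the frontier would allow player 2 to receive $0.75$ when
player 1 receives $0.5$.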

2.2.2 A multi-stage procedure

At this juncture, it seems logical to argue that if we want to get better out- 
comes, perhaps we should work with more elaborate bargaining procedures 
— in particular, bargaining procedures that keep repeating themselves, so 
that if somebody rejects, then this is not the end of the world; another round 
of negotiation may yet take place. I will now present a particular, but typical, 
multi-stage procedure. It goes as follows: 

• As before, a player is chosen by tossing a coin, and she makes a (feasible)
proposal $(u_1, u_2) \in V(N)$.

• The other player can accept or reject.

- Acceptance ⇒ $(u_1, u_2)$.

- Rejection ⇒

* With probability $p < 1$, the game repeats.

* With probability $1 - p$, the players get $c$.

Note that in the case of rejection, the probability of breakdown is not 
1, but only 1 — p < 1. This number (which is a parameter of the problem) 
could be large or small, but I want you to think of it as small, so that if 
there is persistent rejection, then with high probability, the procedure will 
not terminate with breakdown immediately, but will do so only quite far in 
the future. A typical interpretation (but not the one that I want to emphasize
here) is to think of $1 - p$ as a rate of time-discounting. (In this case, we should
interpret c as the utility that will be obtained if there never is agreement.) 
The point is that there is a cost of delaying agreement one round. This cost 
may be that of time passing, or it may be something else. For example, in 
the case of implementation, the designer can set a device which incorporates 
the possibility of breakdown. 

The procedure just described is not the only possible one. Note, in
particular, that it is time-stationary. We could, for example, also have a
non-time-stationary rule. Fix a horizon $T < \infty$ such that if there is no
agreement by time $T$, everything stops and players get $c$. However, assume
that up to time $T$ there is no cost of delaying agreement. This is somewhat
discontinuous (and nonstationary), and it doesn’t lend itself to a simple
analysis, whereas the device proposed above does.

2.2.3 Stationary Perfect Equilibrium

Which solution concept will we adopt? We have a game that will terminate 
with probability 1, but which is, in principle, infinite. There is no end of time, 
and therefore I cannot use backward induction. Instead, I will adopt a very 
simple solution concept called stationary perfect equilibrium. Perfect implies 
that we are still within the framework of backward induction. Stationary 
means that we are focusing on equilibria where chosen strategies do not 
depend on history or on calendar time. Proposals will be independent of 
whatever has happened in the past. Similarly, the responses will depend 
only on the proposal received, and not on past proposals. 

Finally, note that we are still talking about an equilibrium. That is, I am 
referring to an equilibrium, in which the strategies happen to be stationary 
ones, but it is a true equilibrium. In particular, there is no restriction on 
the strategy set, and players do contemplate the possibility of every sort of 
complicated nonstationary deviation. In the universe of perfect equilibria — 
where every equilibrium is as good as any other — the equilibria that are 
most descriptively simple are the stationary ones. So just as, when one looks 
at a dynamical system, one first looks at the rest points, it makes some sense 
to look at the stationary equilibria first. 

2.2.4 Graphical solution by equilibrium equations

I’m going to try to solve the equilibrium problem graphically. The treatment 
is not meant to be rigorous. Focus on a particular stationary perfect
equilibrium. Call $b = (b_1, b_2)$ the expected payoffs at $t = 0$ when this equilibrium
is played. Since the utility possibility set is convex, $b$ must be a feasible point,
but it does not need to be at the boundary of $V(N)$, and in figure 3, it is
shown in the interior. Remember that we do not know a priori that efficiency 
is guaranteed (and in fact, as we will see, it is not). 

Now suppose that player 1 is chosen to be the proposer. What will 1
propose? He will try to evaluate how much it costs 2 to reject. Well, if
2 rejects 1’s proposal, then with probability $p$, everything is repeated, and
because of stationarity, we come back to $b$. With probability $1 - p$, we go to $c$.
So the expected payoff vector is $pb + (1 - p)c$. Hence, by rejecting 1’s proposal,
2 can guarantee herself $p b_2 + (1 - p) c_2$. Player 1 will therefore propose the
point $a^1$ (shown in figure 3) that maximizes his own payoff subject to 2 getting
at least $p b_2 + (1 - p) c_2$, the minimum payoff that guarantees 2’s acceptance.
Similarly, 2 will propose the point $a^2$ that maximizes her payoff, subject to
1 getting at least $p b_1 + (1 - p) c_1$, which guarantees that 1 accepts as well.
Note that along the equilibrium path, there will be no rejection.
Figure 3: Graphical analysis of the multi-stage procedure

We conclude that, if 1 is the proposer, the outcome is $a^1$, and if 2 is
the proposer, the outcome is $a^2$. So far, we have not brought any consistency
conditions into the analysis. To close the system, we notice that the expected
outcome of the equilibrium is $b$. Therefore, we must have

$$b = \frac{1}{2}\left(a^1 + a^2\right).$$

That is, the vector $b$ must lie at the midpoint of the line segment between
$a^1$ and $a^2$. This is a real condition. If I start with an arbitrary $b$, and then I
construct $a^1$ and $a^2$ (as indicated above) and take their midpoint, I need not
come back to $b$, and in figure 3, I do not. If I do happen to come back to $b$,
then I have found an equilibrium. This is the case in figure 4.

The stationary equilibrium payoff vector $b$ depends on $p$, but it can be
verified that, given $p$, $b$ is unique. Note that it is not efficient.
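
The equilibrium equations can also be solved numerically. The following is a
minimal sketch, assuming a hypothetical frontier $u_2 = 1 - u_1^2$ and
$c = (0, 0)$; repeatedly replacing $b$ by the midpoint of the induced proposals
is used here as a simple numerical heuristic, not as a method taken from the
lecture.

```python
import math

def frontier(u1):
    """Hypothetical frontier of V(N): u2 = 1 - u1**2 for u1 in [0, 1]."""
    return 1.0 - u1 ** 2

def frontier_inverse(u2):
    """Largest u1 compatible with player 2 receiving u2."""
    return math.sqrt(1.0 - u2)

def proposals(b, p, c=(0.0, 0.0)):
    """Proposals a^1 and a^2 when the continuation payoffs are b."""
    r1 = p * b[0] + (1 - p) * c[0]   # what 1 secures by rejecting
    r2 = p * b[1] + (1 - p) * c[1]   # what 2 secures by rejecting
    a1 = (frontier_inverse(r2), r2)  # 1 keeps the most compatible with 2 accepting
    a2 = (r1, frontier(r1))          # 2 keeps the most compatible with 1 accepting
    return a1, a2

def stationary_payoffs(p, c=(0.0, 0.0), iterations=2000):
    """Iterate b -> midpoint of (a^1, a^2), starting from the disagreement point."""
    b = c
    for _ in range(iterations):
        a1, a2 = proposals(b, p, c)
        b = ((a1[0] + a2[0]) / 2, (a1[1] + a2[1]) / 2)
    return b

b = stationary_payoffs(0.9)
a1, a2 = proposals(b, 0.9)
print(b, a1, a2, frontier(b[0]) - b[1])
```

The last printed number is the distance from $b$ to the frontier; it is
strictly positive, in line with the remark that $b$ is not efficient.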

At this point, we can observe something very interesting. Normalize $c$ to
$(0, 0)$ (this is just for convenience). Take the straight line through $a^1$ and
$a^2$, and extend it until it hits the axes (figure 5). The two triangles $BOA$ and
$a^1 D a^2$ are similar, and the line $Ob$ splits them in half. Now, since $b$ is the
midpoint of the hypotenuse of $a^1 D a^2$, it follows that $b$ is also the midpoint
of the hypotenuse of $BOA$. Now imagine that $1 - p$ is very small, so that the
triangle $a^1 D a^2$ is also very small. Then the slope of $AB$ is almost equal to
the slope of the boundary of $V(N)$ near $a^1$ and $a^2$ (assuming that $V(N)$ has
a smooth boundary).

Figure 4: Equilibrium condition for the multi-stage procedure

Figure 5: Approximate efficiency for large p

So, for $1 - p$ very small, we have, almost, the following property: The
equilibrium payoffs $b$ are efficient and are such that when we take the tangent
to the boundary of $V(N)$ at the equilibrium payoffs, $b$ falls at the midpoint
of this tangent (more precisely, the midpoint of the segment of this tangent
that lies between its intersections with the axes). We should recognize this
as the defining property of the Nash bargaining solution. Recall that for a
utility possibility set as illustrated below (figure 6), if the vector $b$ has the
property that the segments $Ab$ and $bB$ have the same length, then it follows
that $b$ is the Nash solution.

Figure 6: The Nash solution

This comes right out of the axioms of the Nash solution. Consider first an
economic budget set, that is, a straight line (see figure 6). Then if we maximize
the product of the coordinates on this line, we come to the midpoint. So $b$ is
the Nash solution for the budget set. But then the contraction independence
axiom implies that, since when we go from the economic budget set to $V(N)$,
we only make the utility possibility set smaller while $b$ remains feasible, the
solution remains $b$ after the change.
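
The midpoint property on a straight line can be verified directly. As a
sketch, assume the disagreement point is normalized to the origin and let the
line meet the axes at $A = (\alpha, 0)$ and $B = (0, \beta)$; the symbols
$\alpha$ and $\beta$ are introduced here for illustration only.

```latex
\[
\max_{u_1, u_2 \ge 0} \; u_1 u_2
\quad \text{subject to} \quad \frac{u_1}{\alpha} + \frac{u_2}{\beta} = 1 .
\]
Substituting $u_2 = \beta\,(1 - u_1/\alpha)$ gives the objective
$\beta\,(u_1 - u_1^2/\alpha)$, which is concave in $u_1$. Its first-order
condition $1 - 2u_1/\alpha = 0$ yields
\[
u_1 = \frac{\alpha}{2}, \qquad u_2 = \frac{\beta}{2},
\]
that is, $b = \tfrac{1}{2}(A + B)$, the midpoint of the segment $AB$.
```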

We conclude that if the cost of renegotiation is very low, then at the first 
stage of the stationary perfect equilibrium of the bargaining procedure, the 
proposer will propose a payoff which is close to the Nash bargaining solution, 
and the respondent will accept. We emphasize that: 

1. We obtain this result because, in principle, negotiation can go on for a 
very long time. But in fact, it will not go on for long. It will end in the 
first round. 

2. The proposer does not really matter. Both agents will propose almost 
the same outcome. 
2.3 Some Remarks

2.3.1 On the stationarity restriction

I have been talking about stationary perfect equilibria. In fact, there is a 
notable result due to Rubinstein (1982), which asserts that, for this model, 
to get the equilibrium payoffs we derived above, the stationarity restriction is 
actually not required. Every perfect equilibrium of this procedure has exactly 
the characteristic that we just described: On the equilibrium path, one player 
(whoever is the proposer) makes some proposal (the same for all equilibria) , 
and this proposal is accepted. This is a very singular result, but I will not 
emphasize it. It is quite remarkable, but something of a jewel — admirable 
and beautiful, but hard to replicate. In particular, it does not generalize to 
more than two players. 

2.3.2 A dynamic analysis

Associated with the previous discussion there is a nice dynamic analysis.
I cannot be precise here, but M. Maschler and collaborators have done
much research on this topic. Suppose that you take exactly the bargaining
procedure I have presented, except that you truncate it at period $T \gg 0$,
when the world ends. Thus, we are contemplating a nonstationary problem
such that the disagreement point is, say, 0, and such that at $t = 0$, we have
a utility possibility set $V_0 = V(N)$. Then, there is a contracting sequence of
utility possibility sets $V_1, V_2, \ldots$ (figure 7) such that each $V_t = p^t V(N)$
is the set of feasible expected payoffs if there is no agreement before time $t$.
Assume also that $T$ is so large that $p^T$ is nearly equal to 0.

This problem can be solved by backward induction. You just think about
what would happen at the end of the world, and then given that, you look
at stage $T - 1$, etc. Figure 7 illustrates the construction of the equilibrium
expected payoffs $b(T - 1)$ at $t = T - 1$ from the equilibrium expected payoffs
$b(T) = \frac{1}{2}[a^1(T) + a^2(T)]$ in the last period. You can proceed in this manner
until, finally, you derive $b(0)$. The process will begin to look like a differential
equation. We can then make the jump to real differential equations, so that
the backward induction yields a system of differential equations which, as it
turns out, converges to the Nash solution.
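
The backward induction can be carried out numerically. The following is a
minimal sketch, assuming a hypothetical frontier $u_2 = 1 - u_1^2$ for
$V(N)$, disagreement point 0, and payoffs measured in time-0 units so that
$V_t = p^t V(N)$; the function names and parameter values are illustrative
assumptions.

```python
import math

def frontier(u1):
    """Hypothetical frontier of V(N): u2 = 1 - u1**2 for u1 in [0, 1]."""
    return 1.0 - u1 ** 2

def frontier_inverse(u2):
    """Largest u1 compatible with player 2 receiving u2."""
    return math.sqrt(1.0 - u2)

def truncated_payoffs(p, horizon):
    """Backward induction with V_t = p**t * V(N) and disagreement point 0.

    Returns b(0), the expected payoffs at t = 0 in time-0 units.
    """
    b = (0.0, 0.0)  # payoffs if the proposal in the last period is rejected
    for t in range(horizon, -1, -1):
        scale = p ** t  # V_t is V(N) shrunk by the probability of reaching t
        # Proposer 1 leaves 2 exactly her continuation payoff b[1].
        a1 = (scale * frontier_inverse(b[1] / scale), b[1])
        # Proposer 2 leaves 1 exactly his continuation payoff b[0].
        a2 = (b[0], scale * frontier(b[0] / scale))
        b = ((a1[0] + a2[0]) / 2, (a1[1] + a2[1]) / 2)
    return b

b0 = truncated_payoffs(0.99, 2000)
nash = (1 / math.sqrt(3), 2 / 3)  # maximizer of u1 * (1 - u1**2)
print(b0, nash)
```

For this example the Nash solution relative to the origin is
$(1/\sqrt{3}, 2/3)$, and with $p = 0.99$ and a long horizon the computed
$b(0)$ lies close to it.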

2.3.3 Variation in the breakdown point

I have assumed that the breakdown point c is given independently of the 
history that leads to breakdown. I could consider (why not?) a more com- 
plicated model in which the breakdown point depends, for example, on who 
has been responsible for the breakdown, perhaps the refuser or perhaps the 
last proposer. One can think of many variations. But let me focus on one. 
Suppose that $V(N)$ is as before, but that the breakdown point depends
on who the last proposer was before breakdown. Otherwise, the bargaining
procedure is as before. Again we have a $p$. The only change is that as we
go through time, if player 1 makes a proposal, 2 rejects it, and there is
breakdown, then we go to some point $c^1$. If the breakdown happens after 2
proposes and 1 rejects, then we go to $c^2$ instead (figure 8). Note that if you
think of $p$ in terms of time discounting then this doesn’t make sense, but
for other interpretations, it does make sense. When 1 evaluates the utility 2
gets from rejecting, he should consider $c^1$. If 2 rejects, then $c^1$ occurs with
probability $1 - p$, and play continues with probability $p$. It turns out (this is
very easy to check) that to solve this model, we can proceed by constructing
a kind of fake disagreement point (shown in figure 8): $c = (c^2_1, c^1_2)$. Then we
can continue exactly as we did before. Taking $c$ as the disagreement point, if
$p$ is large, the outcome will be nearly the Nash solution calculated from this
fake disagreement point. Note that this disagreement point has no reality.
Outcome $c$ will never occur. What can occur is $c^1$ or $c^2$. But the theory can
still make use of the fake disagreement point.
Figure 8: The breakdown point depends on the last proposer



One interesting point that I would like to mention is that there is no reason
why things have to be as I drew them in figure 8. In fact, it could be as in
figure 9. The only requirement is that $c^1$ and $c^2$ be in the feasible set $V(N)$.
But, as constructed in figure 9, $c$ need not be feasible itself. We can still
look at the bargaining procedure and derive its stationary equilibria. What
will we get? The equilibrium will have to satisfy exactly the same equations
as before. Given an equilibrium payoff vector $b$, we construct $a^1$ and $a^2$ as we
did earlier, and it must be the case that $b$ is at their midpoint. (See figure 9.)
If $1 - p$ is small, then to find something that is almost the solution, you look
at $c$ as a disagreement point, and you look at the point $b$ on the boundary
of $V(N)$ such that when you take the tangent at $b$, $b$ is at the midpoint of
the line segment $AB$ shown in figure 10. Two things are worth noting about
this construction. First, it amounts to guaranteeing the first order conditions
(but not the second!) of the Nash product "maximization problem." Second,
the solution need not now be unique.

2.3.4 What about Kalai-Smorodinsky?

My entire discussion has led us to the Nash bargaining solution. You heard in 
W. Thomson’s lecture that the Kalai-Smorodinsky solution is as important as 
Nash’s, so you may ask whether I can get Kalai-Smorodinsky’s solution by a
bargaining procedure similar to the one I have described above. I cannot give 
you an affirmative answer to this question. I could offer you some bargaining 
procedures, but these would be of a very different character. However, I 
can offer you some insight by obtaining a solution which is in the spirit of 
Kalai-Smorodinsky. 

Figure 9: Disagreement point outside feasible set

Figure 10: Near efficiency with a fake disagreement point

We do this as follows: Put $p = 1$, so that there is no cost of delay. To avoid
being degenerate, also assume a fixed time horizon $T < \infty$; i.e., we repeat only
$T$ times. We now apply backward induction. Consider the last period. If 1 is
the proposer, then 1 will offer $a^1(T)$; similarly, 2 will offer $a^2(T)$, as shown
in figure 11. Therefore, if there is rejection at $T - 1$, the expected payoff is
the midpoint $b(T) = \frac{1}{2}[a^1(T) + a^2(T)]$. What will happen in period $T - 1$?
Both players know that if the other rejects, then they get $b(T)$, so player
1 proposes $a^1(T - 1)$ shown in figure 11, and 2 proposes $a^2(T - 1)$. We continue
this construction to get $a^1(T - 2)$, $a^2(T - 2)$, etc. As $T$ grows large, we
approach the boundary. This is not the Kalai-Smorodinsky solution. It is
called the Raiffa solution, but it clearly seems to be in the same general
category as Kalai-Smorodinsky’s.
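
The construction is easy to simulate. The following is a minimal sketch,
assuming a hypothetical frontier $u_2 = 1 - u_1^2$ and $c = (0, 0)$; for this
example the Kalai-Smorodinsky point is taken to be the equal-payoff point on
the frontier (ideal point $(1, 1)$), and all names are illustrative.

```python
import math

def frontier(u1):
    """Hypothetical frontier of V(N): u2 = 1 - u1**2 for u1 in [0, 1]."""
    return 1.0 - u1 ** 2

def frontier_inverse(u2):
    """Largest u1 compatible with player 2 receiving u2."""
    return math.sqrt(1.0 - u2)

def finite_horizon_payoffs(rounds, c=(0.0, 0.0)):
    """Backward induction with p = 1; rejection in the last round yields c."""
    b = c
    for _ in range(rounds):
        a1 = (frontier_inverse(b[1]), b[1])  # 1 leaves 2 her continuation payoff
        a2 = (b[0], frontier(b[0]))          # 2 leaves 1 his continuation payoff
        b = ((a1[0] + a2[0]) / 2, (a1[1] + a2[1]) / 2)
    return b

for rounds in (1, 2, 3, 10):
    b = finite_horizon_payoffs(rounds)
    print(rounds, b, frontier(b[0]) - b[1])

# Equal-payoff point on the frontier: u = 1 - u**2.
ks = (math.sqrt(5) - 1) / 2
print(ks)
```

In this example the distance to the frontier shrinks rapidly with the number
of rounds, and the limit has first coordinate near 0.608, which differs from
the equal-payoff value of about 0.618.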



3 N-Player Games 

I will now move to N players. The theory is less settled here, and so I will 
be much more particular and merely illustrative. There is much work on 
this topic, and I cannot possibly cover all the available results. Fortunately, 
the lecture by P. Reny will also touch on this general area, and he will 
complement our discussion quite well. The presentation from now on is in the 
spirit of implementation theory. Thus, I will just present instances of how, 
under certain restrictions, this or that solution concept can be supported by 
a noncooperative procedure. But I will not discuss whether such a procedure 
is sufficiently descriptive of “real bargaining.”

3.1 Background

Since I want to relate my discussion to cooperative game theory, let me begin 
by describing the physical situation that is being contemplated in terms of 
the formalism of the characteristic function. I assume that there are N 
players and that for all $S \subseteq N$, V(S) is the attainable set for S. I want to
be a bit more precise here. Thus, in the spirit of an economic approach, let
us think of this V as describing an economic situation in which people are
endowed with resources and where there are no externalities of any kind, so
that V(S) is the set of utility possibilities that the group S can reach by
using the resources of its members (this set is independent of whatever the 
players not in S eventually get). 

I am focusing on this particular scenario because it is a very simple one 
in which there is no complication whatsoever in interpreting what the char- 
acteristic function is. In general I think that for the type of discussion that 
we are carrying out, it is indispensable to have an interpretation, a partic- 
ular story about this characteristic function. Otherwise it is very hard to 
know what one is talking about. For example, in a model with externalities 
and other interactions, the characteristic function may have been constructed 
taking into account a certain number of strategic considerations that concern 
how the members of a coalition and its complement will behave. In this case, 
it would be artificial to analyze the problem using other strategic considera- 
tions. If we are focusing on the core, then the kind of strategic considerations 
that go into its definition may also be strategic considerations that lead us 
to a certain kind of characteristic function; while for the Shapley Value, say, 
we may be led to another characteristic function. I think, therefore, that 
you cannot separate the construction of the characteristic function from the 
solution concept that you are going to use, except in completely natural cases 
like pure resource problems. Since matters are already complicated enough 
here, we will stick to this case. 

3.2 Two approaches to the N-player case 

In the spirit of what I have done so far, I want to generalize the bargaining 
procedures presented earlier, and in a manner that fits in with the lessons of 
cooperative game theory. This already points one in certain directions. For 
example, in the N = 2 case, I focused on the Nash solution. That means 
I have already focused on a single-valued solution. It would not be natural
now to move in the direction of the core, which is multi-valued even in the 
case of two players, and which certainly does not equal the Nash solution. In 
fact, the approach I am taking is directing me towards value-type solutions, 
and I will not resist this direction. 

Let me remark that, in my view, it is not yet well understood what distin- 
guishes the types of bargaining solutions that take us to the core from those 
that take us to value. Very vaguely, my impression is that the distinction has
something to do with the meeting technology of the players, that is, with
how different players meet. If the choice of players that meet is very strategic 
— namely, a player chooses and looks around for partners — then I think 
that we are pushed towards core-like notions; while if the meeting technol- 
ogy is that people meet at random somehow, so that people have very little 
choice about how they find their companions, then I think we are pushed 
more towards value solutions. 

From now on, we have N players, and they need to meet to bargain over 
how to split some sort of pie. To analyze this situation further, it is critical 
to establish how players meet. In the value-oriented literature, we find two 
varieties of what we could call “meeting technologies.” Because I am more
familiar with one of them I will follow that one, but I must mention both. 
They both generalize the formulation we adopted in the two-player case. 

The first is the technology of pair-wise meetings. The meeting of pairs 
(the “buyer” and the “seller”) is pervasive throughout economics (cf. Gale
1986, Rubinstein & Wolinsky 1985). For the current problem, the technology
of pair-wise meetings has been used by Gul (1989). In his important paper, 
there is a collection of people who have resources and who meet at random 
in pairs. When they meet, one proposer is chosen at random. The proposer 
makes a proposal to buy the resources of the respondent. The respondent 
may accept, in which case he disappears from the game with the payment, 
or he may not accept, in which case both members of the pair go back to the 
pool of players and negotiation continues in this manner. 

The second technology, which is the one I will adopt, is that of multi- 
lateral meetings. More precisely, I mean by this that at any point in time, 
there is an assembly of all the bargainers, and the proposer (chosen in some 
manner) addresses the entire assembly. I will follow this approach, but I also 
recommend that you look at the Gul paper. 

3.3 An illustrative example 

3.3.1 Setup

I am going to present, as an example, a bargaining procedure which is a 
generalization of the previous (2-player) bargaining procedure. It is taken 
from Hart & Mas-Colell (1992). The key feature of this procedure is that 
players may drop out throughout the negotiation process. 

The procedure is as follows: Assume that $S \subseteq N$ is the set of players still
involved in negotiation. Initially, we will just have S = N.

• Choose a player i at random from S using a uniform distribution.

• Player i proposes a payoff vector $u \in V(S)$.

• Other players are asked (sequentially) if they agree or dissent.

  - All agree ⇒ u is implemented.

  - Any player dissents ⇒

    * With probability p, the game repeats.

    * With probability 1 - p, breakdown occurs.



What does breakdown mean? I don’t want it to mean, as before, that
everything is ended and that we go to some disagreement point $c \in V(S)$.
I avoid this meaning because I want subcoalitions to matter. There are
a multitude of other meanings of “breakdown” that one could consider. I
encourage you to consider some. For example, “breakdown” could mean that
some player is at risk of disappearing. It could be that there is some $\delta < 1$
such that each player disappears with his own resources with probability $\delta$.
If $\delta$ is small, then the probability of two players disappearing simultaneously
is negligible, and so breakdown means that one of the players, chosen at
random, will disappear, and we will go on with a smaller game.

To be specific, I will focus on another particular meaning of breakdown. 
I choose this particular example purely because it fits with the analysis of 
the Nash solution I presented before, and because I want to tie this analysis 
to the Shapley Value. Thus, I will present a breakdown technology which 
has the feature that if I look at the equilibria, then in the pure bargaining 
case I get the Nash solution, and in the case of transferable utility — another 
leading case for analysis — I get the Shapley Value. 

The breakdown technology which I will use, and which, I could argue,
is the only technology which works for this purpose, is the following: With
probability 1 - p, the proposer disappears (taking with him and consuming
his own resources). The game then repeats with only the players in $S \setminus \{i\}$.
So proposers that are frivolous enough to invite rejection run the risk of being 
out of the game. At the same time, of course, they are not always thrown 
out of the game because they have resources that the other players value. 

As before, I will look at the stationary perfect equilibrium. If I could do 
with perfect equilibrium, I would be happier, but unfortunately, in the games 
that we are analyzing, the set of perfect equilibria is large (if N > 2). 

3.3.2 The equilibrium conditions

How do we analyze problems like this? We already know how to determine
the stationary perfect equilibrium equations. For the case N = 2, I drew a
picture (figure 4). Now, since there are many more than two equations, I
cannot draw a picture, but I can still write down the equilibrium equations
without any difficulty. The logic is the same.

The equilibrium objects are the proposals. For every foreseeable coalition
S of players still in the game and for every $i, j \in S$, let $a^{S,i}_j$ represent the
proposal that i makes to j if i is the proposer (for j = i, this is what the
proposer keeps for himself). Let’s see now what sort of
consistency conditions these numbers need to satisfy. Define the expected
payoff of the players in S to be

$$a^S = \frac{1}{|S|} \sum_{i \in S} a^{S,i}.$$

This is the average of all the payoffs that result if all players’ offers are
accepted. There will be two equilibrium conditions:

1. For all S and $i \in S$, $a^{S,i} \in \partial V(S)$; i.e. $a^{S,i}$ is efficient, and is therefore
on the boundary of V(S) and not in the interior.

2. If $i \in S$ is the proposer, then he will offer j the minimum possible
payoff, which is what j would get if she rejected. He must make sure
that j will not reject. If j rejects, then with probability p, everything
will be repeated, and she will get $a^S_j$, and with probability (1 - p), i
will be thrown out of the game, and j will expect to get $a^{S \setminus i}_j$ instead.
Thus, for all $j \in S \setminus \{i\}$,

$$a^{S,i}_j = p\,a^S_j + (1-p)\,a^{S \setminus i}_j \qquad (1)$$

We can then solve the system using conditions 1 and 2. Notice that the
result is a stationary perfect equilibrium.

3.3.3 A Remark

If p is very close to 1, then the term $(1-p)\,a^{S \setminus i}_j$ is very small. So, the
proposals to j will depend on who the proposer i is, but in fact, no matter
who i is, this proposal will be very close to the average $a^S_j$, which does not
depend on i. Therefore, the proposals of all the players in S will lie close
together on the boundary of V(S), so that the average of these proposals,
$a^S$, will be almost efficient, as illustrated in figure 12. If we examine the
equilibrium equations, then as long as p is close to 1, we see that, first, it
won’t matter very much who the first proposer is and, second, the average
proposal will be approximately efficient.

3.3.4 The TU case

Do we know anything else about the case in which p is large? Well, in the 
pure bargaining case for two players, once we defined the disagreement point, 
the stationary equilibrium of this model was the Nash solution. The same 
argument generalizes to N players if the strict subcoalitions cannot generate 
gains from trade (the pure bargaining case). But we are interested in the 
general situation in which the worth of a subcoalition matters.

Figure 12: Approximate efficiency for large p

Consider the well understood TU case. Here, it turns out that, for any 
p, the stationary equilibrium payoffs are the Shapley values. This is, inci- 
dentally, why I chose this particular breakdown technology. While this result 
holds for all p, remember that in order to make sure that the proposals themselves
(not just their average) are the Shapley values, we do need p to be close
to 1. I will try to provide some intuition for this result. If you are familiar
enough with the Shapley value, then you look at the system of equations and
say, “Of course.” But let me argue directly from the axioms. The Shapley
value is characterized by four axioms.

1. Efficiency. This is guaranteed for equilibrium payoffs because all the
equilibrium proposals are on the boundary and the boundary is flat, so that
the expectation is also on the boundary.

2. Equal Treatment. I have never distinguished any particular player
from any other. Thus, clearly, the “stationary perfect equilibrium payoffs”
solution must be symmetric.

3. Additivity (Linearity). Write down the system of equations. In the
linear (TU) case we simply have, for all S and $i \in S$,

$$\sum_{j \in S} a^{S,i}_j = v(S) \qquad (2)$$

and therefore

$$\sum_{j \in S} a^S_j = v(S). \qquad (3)$$

A quick examination of (1) and (3) reveals that we can solve these for all the
$a^{S,i}$’s recursively. We will thus obtain some complicated expression, but this
expression will be linear in the $v(S)$’s.
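One way to carry out this solution step explicitly is sketched below. The closed form is not spelled out in the lecture; it is derived here from the equilibrium equation for a responder's offer, from efficiency of each proposal, and from efficiency of the expected payoffs in S and in the smaller game, writing $s = |S|$ and $a^{S \setminus j}$ for the expected payoffs in the game without j.

```latex
\begin{align*}
a^{S,i}_i &= v(S) - \sum_{j \in S\setminus\{i\}} a^{S,i}_j
           = v(S) - p\,\bigl[v(S) - a^S_i\bigr] - (1-p)\,v(S\setminus\{i\}) \\
          &= p\,a^S_i + (1-p)\,\bigl[v(S) - v(S\setminus\{i\})\bigr], \\[4pt]
s\,a^S_i  &= a^{S,i}_i + \sum_{j \in S\setminus\{i\}} a^{S,j}_i
           = s\,p\,a^S_i + (1-p)\Bigl[v(S) - v(S\setminus\{i\})
             + \sum_{j \in S\setminus\{i\}} a^{S\setminus j}_i\Bigr], \\[4pt]
a^S_i     &= \frac{1}{s}\Bigl[v(S) - v(S\setminus\{i\})
             + \sum_{j \in S\setminus\{i\}} a^{S\setminus j}_i\Bigr]
             \qquad (p < 1).
\end{align*}
```

The right-hand side of the last line involves only the worths and the expected payoffs of the smaller games, so by induction on the size of S it is linear in the worths and does not depend on p. It has the same form as a standard recursion satisfied by the Shapley value.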

4. Dummy Axiom. This is where our particular breakdown technology
comes into play. Intuitively, this technology implies that when some $i \in S$
makes a proposal to the players in S, if she has nothing to contribute to
the other players, then these players do not pay any cost when they force
a delay by rejecting her offer. Either the game is repeated, or i is kicked
out, which makes no difference since the pie remains the same. Thus, the
procedure gives no power to a dummy player. In a bargaining procedure, a
player can have two types of power: one that derives from her resources, and
another that derives from her ability to prevent agreement. A dummy has
no resources, and this bargaining procedure makes the delays caused by her
harmless to the other players. She therefore has no power of either kind.

Here is a formal proof that a dummy player gets nothing in a stationary
perfect equilibrium. The proof is by induction. Suppose that the claim holds
for all games up to size N - 1. I will prove that it also holds for games of size
N. Suppose that i is a dummy. I need to show, first, that when i proposes,
he proposes 0 for himself, and, second, that when another player proposes,
the proposal to i is 0.

Suppose that i is the proposer. How much will i propose to the other
players? Adding the equilibrium equations (1) above, we get:

$$\sum_{j \in N \setminus \{i\}} a^{N,i}_j = p \sum_{j \in N \setminus \{i\}} a^N_j + (1-p) \sum_{j \in N \setminus \{i\}} a^{N \setminus i}_j.$$

Then substituting using (2) and (3), we get

$$v(N) - a^{N,i}_i = p\,[v(N) - a^N_i] + (1-p)\,v(N \setminus \{i\}) = v(N) - p\,a^N_i,$$

where $v(N \setminus \{i\}) = v(N)$ since i is a dummy. Therefore, i proposes for himself

$$a^{N,i}_i = p\,a^N_i.$$

Now suppose that $j \neq i$ is the proposer. Then, again using (1), we have

$$a^{N,j}_i = p\,a^N_i + (1-p)\,a^{N \setminus j}_i.$$

Because i is a dummy, and $N \setminus \{j\}$ has only N - 1 players, the induction
hypothesis tells us that $a^{N \setminus j}_i = 0$. Therefore, in parallel to what we derived
above, we have

$$a^{N,j}_i = p\,a^N_i$$

and this is true for all $j \neq i$.

Hence, whether i is the proposer or not, he gets p times his expected
value. So by definition,

$$a^N_i = \frac{1}{N} \sum_{j \in N} a^{N,j}_i = p\,a^N_i,$$

which implies that

$$a^N_i = 0 \quad \text{because } p < 1.$$

Therefore $a^{N,j}_i = p \cdot 0 = 0$ for all $j \in N$. ∎

We have now shown that all the axioms of the Shapley value are satisfied. 
Therefore, the outcome must be the Shapley value. 
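The claim can also be checked numerically on a small example. The sketch below is an illustration only, not the author's procedure for solving the system: it takes a hypothetical three-player TU game, solves the stationary equations coalition by coalition with a simple fixed-point iteration (each responder is offered p times her expected payoff plus (1 - p) times her expected payoff once the proposer has left, and the proposer keeps the rest), and compares the result with the Shapley value computed from marginal contributions. The names `stationary_payoffs` and `shapley`, the game `v`, and the number of sweeps are assumptions made for this example.

```python
from itertools import combinations, permutations

def stationary_payoffs(players, v, p, sweeps=500):
    """Expected payoffs a^S for every coalition S, found by fixed-point iteration.

    v maps frozenset coalitions to their worth; p is the probability that the
    game repeats after a rejection (with 1 - p the proposer drops out).
    """
    a = {frozenset(): {}}
    for size in range(1, len(players) + 1):
        for S in map(frozenset, combinations(players, size)):
            avg = {j: 0.0 for j in S}                 # initial guess for a^S
            for _ in range(sweeps):
                new = {j: 0.0 for j in S}
                for i in S:                           # i is the proposer
                    sub = a[S - {i}]                  # payoffs once i has left
                    offer = {j: p * avg[j] + (1 - p) * sub[j]
                             for j in S if j != i}    # what each responder is offered
                    offer[i] = v[S] - sum(offer.values())  # proposer keeps the rest
                    for j in S:
                        new[j] += offer[j] / len(S)   # average over proposers
                avg = new
            a[S] = avg
    return a

def shapley(players, v):
    """Shapley value as the average marginal contribution over all orderings."""
    orders = list(permutations(players))
    value = {i: 0.0 for i in players}
    for order in orders:
        coalition = frozenset()
        for i in order:
            value[i] += (v[coalition | {i}] - v[coalition]) / len(orders)
            coalition = coalition | {i}
    return value

# Hypothetical game: player 1 needs player 2 or player 3 to create worth 1.
players = (1, 2, 3)
v = {frozenset(c): 0.0 for k in range(4) for c in combinations(players, k)}
v[frozenset({1, 2})] = v[frozenset({1, 3})] = v[frozenset({1, 2, 3})] = 1.0

phi = shapley(players, v)
for p in (0.5, 0.9):
    a_N = stationary_payoffs(players, v, p)[frozenset(players)]
    print(p, {i: round(a_N[i], 6) for i in players},
          {i: round(phi[i], 6) for i in players})
```

In this example the expected payoffs agree with the Shapley value for both values of p, while the individual proposals (not printed) do depend on p.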

3.3.5 Closing Remarks: The NTU Case

I repeat that what I have shown you here is only an example. However, it
prompts a question: We have considered a bargaining procedure which in
familiar cases gives very familiar solutions. Can we now do some conceptual
boot-strapping, so that, after using cooperative theory to motivate a
non-cooperative procedure, we can then go back to cooperative theory and
discover what the particular non-cooperative procedure yields in the general
NTU case? Interestingly enough, for $p \to 1$, this procedure yields the “consistent
solution” introduced by Maschler & Owen (1992). It does not yield
either of the two more familiar solutions: the Shapley NTU value solution
or the Harsanyi solution. Note that our particular bargaining procedure was
not designed to yield the consistent solution, and that Maschler and Owen
derived it from very different consistency-like requirements. Now you can
ask, “What is the consistent solution?” I could spell it out for you, and I
could also add:

Theorem: When p is close to one, the stationary perfect solution of the
bargaining procedure is close to the consistent solution.

But since I have only -2 minutes left, I am going to simplify matters
by transforming a theorem into a definition, and I will answer your question
by saying that the consistent solution is the limit as $p \to 1$ of the stationary
perfect equilibrium of the particular bargaining procedure that I have
described!

1. Some General Results
1.1 Introduction 

What is implementation theory all about? To answer this question we shall 
follow Moore (1991) by describing a classic problem known as “King Solomon’s
Dilemma.” As it happened, two women approached Solomon with a newborn
child. Each claimed to be the child’s mother. It was up to Solomon to decide 
which one was telling the truth. In his wisdom, Solomon had the women lay 
the child before him. He drew his sword and announced that he would settle 
the dispute by cutting the child in half. However, just before he brought down 
his sword the true mother begged that he spare the child’s life and give it to 
the impostor. Knowing that only the real mother would be willing to give up 
the child rather than allow it to die, Solomon gave it to her. Such is the nature 
of an implementation problem which we now describe in rather general terms. 
An implementation problem consists of the following: 

1. A finite set of agents N = {1, 2, ...,n}. 

2. A finite set of states of the world, S. 

A state may include whatever is relevant for the problem at hand. This may 
include only the preferences of the n agents while it might also include their 
endowments etc. 

3. A finite set of social choices, C. (For King Solomon, a choice specifies which 
woman gets the baby.) 

4. A social choice correspondence (SCC) $f : S \to 2^C$. (For King Solomon, it
is single-valued; i.e. a function. An economic example is the Pareto
correspondence.)

This framework covers problems involving the provision of public goods, 
optimal taxation, auction design, monopoly pricing, voting theory, bargaining, 
contract theory, agency theory, etc. 

In general, a “planner” wishes to induce, for each state of the world (hereafter
simply a state), s, a social choice (hereafter simply a choice) in $f(s) \subseteq C$. The
difficulty is that the planner does not know the true state s. In our complete
information setting we assume that the agents 1, 2, ..., n know the true state.
When S includes all agents’ preferences over C this is a very strong assumption.
In some circumstances this assumption is justified (the two mothers in King
Solomon’s dilemma, for example). When it is not, the theory of implementation 
under incomplete information is the relevant framework within which to analyze 
the problem. We shall stick with the complete information case however. 

What does it mean for a planner to “induce a choice in f(s)”? How can
he do this? One way is to have the agents (who know s) participate in a
game (designed by the planner) in which they are driven to reveal their private
information through the course of playing the game “rationally.” Since the
planner must design the game in ignorance of s, the rules of the game cannot
depend on the true state in any way.

1.2 Nash Implementation 

A (simultaneous) game form (rules of a game), denoted G, endows each agent i
with a set of messages, $M_i$, and specifies for each n-tuple of messages chosen
by the agents, a choice in C. The latter is represented by a function
$g : M_1 \times \cdots \times M_n \to C$.

The issue then is this: Given a SCC, f, does there exist a game form, G,
such that

(*) for all $s \in S$ the set of “equilibrium” outcomes of the game (G, s) is precisely
f(s). (Note that the set of equilibrium outcomes of the game may change as s
changes since s determines the agents’ preferences.)

Up to now I have avoided specifying an equilibrium concept. We’ll say
that a SCC is Nash implementable if there is a game form satisfying (*) where
equilibrium there is Nash equilibrium. We similarly define subgame perfectly
implementable SCCs, dominant strategy implementable SCCs, etc.

Let’s now return to King Solomon. Does Solomon’s scheme of threatening 
to cut the child in half meet the demands we have set forth above? Specifically, 
has Solomon succeeded in Nash implementing the desired outcome? No. By 
keeping quiet, the impostor failed to get the baby. Clearly the impostor can 
do no worse by mimicking the true mother. What would Solomon have done if 
both women had begged him to give the child to the other? Unfortunately, the 
biblical account is silent on this issue. 

Can we fare any better than Solomon at solving his dilemma? Is King 
Solomon’s SCC Nash implementable? The following analysis is taken from 
Moore (1991). The elements of the implementation problem are: N = {Ann,
Bess}, C = {a, b, c, d}, S = {α, β}, f(α) = a, f(β) = b, where the choices
a, b, c, d are respectively, Ann gets the baby, Bess gets the baby, the baby is cut
in half, death to all; and the state α (β) denotes that Ann (Bess) is the true
mother. In each state we assume that getting the baby is the best choice and
death to all is the worst choice for both Ann and Bess. In state α (when Ann
is the mother) we assume (as did Solomon) that Ann prefers b to c, and that
Bess prefers c to a. Their preferences are reversed in state β (Ann prefers c
to b, and Bess prefers a to c).

We’ll now argue that the SCC f above is not Nash implementable. If it were,
then some game form would do the job. Let the game form be represented by
a matrix whose entries are the outcomes and where Ann chooses the row and
Bess chooses the column. For simplicity, consider the 3 × 3 matrix below, where
most entries are intentionally left unspecified.

                 Bess

           a     x     ?
   Ann     ?     ?     ?
           ?     ?     ?

Since the matrix game implements f, it must yield precisely choice a as a Nash
equilibrium in state α. Thus choice a must occur as one of the entries, which it
does. But if this corresponds to a Nash equilibrium, then the entry labelled x
(and indeed every other entry in the first row) must either be a or d. Otherwise
Bess would deviate in state α. But this implies that a remains an equilibrium
outcome in state β, even though f(β) = b. Consequently, this game does not
Nash implement f. This argument is completely general and therefore
establishes that Solomon’s SCC is not Nash implementable.

What is it about Solomon’s problem that renders it nonimplementable? The
key observation is this. When the state switches from α to β, choice a moves up
(weakly) in both agents’ preferences. Hence regardless of the game form, if a is a
Nash equilibrium in state α, then it remains an equilibrium in state β. However,
$a \notin f(\beta)$ and so f cannot be Nash implemented. This important insight was
first recognized by Maskin (1977) where a distinction is drawn between SCCs
that are monotonic and those that are not. Formally, a SCC $f : S \to 2^C$ is
monotonic if whenever $a \in f(s)$ and $a \notin f(t)$, there is an agent i and a choice b
such that i weakly prefers a to b in state s, but strictly prefers b to a in state t.
Clearly Solomon’s SCC is not monotonic and this accounts for our inability to
Nash implement it. Thus our argument above in the case of Solomon’s problem
has established the following general result.

Lemma 1.2.1 (Maskin (1977)): If a SCC is Nash implementable, then it is
monotonic.
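The monotonicity condition can be checked mechanically for King Solomon's problem. The following is a minimal sketch, assuming strict rankings written best-first and built from the assumptions stated earlier (getting the baby is best, death to all is worst, and the remaining comparison is the one Solomon assumed); the state labels "alpha" and "beta" and the function names are introduced here for illustration only.

```python
def weakly_prefers(ranking, x, y):
    """True if x is ranked at least as high as y (rankings are strict, best first)."""
    return ranking.index(x) <= ranking.index(y)

def monotonicity_violations(f, prefs, choices):
    """Return the triples (s, t, a) with a in f(s) and a not in f(t) for which
    no agent i and choice b satisfy: i weakly prefers a to b in state s,
    yet strictly prefers b to a in state t."""
    violations = []
    for s in f:
        for t in f:
            for a in f[s] - f[t]:
                reversal = any(
                    weakly_prefers(prefs[s][i], a, b)
                    and not weakly_prefers(prefs[t][i], a, b)
                    for i in prefs[s] for b in choices
                )
                if not reversal:
                    violations.append((s, t, a))
    return violations

choices = ["a", "b", "c", "d"]   # Ann gets baby, Bess gets baby, cut in half, death to all
prefs = {
    "alpha": {"Ann": ["a", "b", "c", "d"], "Bess": ["b", "c", "a", "d"]},  # Ann is the mother
    "beta":  {"Ann": ["a", "c", "b", "d"], "Bess": ["b", "a", "c", "d"]},  # Bess is the mother
}
f = {"alpha": {"a"}, "beta": {"b"}}

print(monotonicity_violations(f, prefs, choices))
```

A non-empty list means the SCC is not monotonic. Here both directions fail: moving from one state to the other, the desired choice never falls relative to any alternative for either woman.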

For the moment, we’ll leave King Solomon and very briefly consider another
SCC, namely the Pareto correspondence. Consider an exchange economy with
a fixed set of agents and endowments. Let f(s) denote the set of Pareto efficient
allocations in state s, where the agents’ preferences are determined by the state
s. Is f monotonic? Yes. Consider $a \in f(s)$ such that $a \notin f(t)$. Then the latter
relation implies that there is a choice b such that every agent strictly prefers b
to a in state t. However, the former implies that there is at least one agent i
who weakly prefers a to b in state s. Thus the Pareto correspondence satisfies
this necessary condition for Nash implementability. But is the Pareto
correspondence Nash implementable? If monotonicity were a sufficient condition for Nash
implementation then the answer would certainly be, yes. However, monotonicity
is not sufficient for Nash implementation (see Maskin (1985) for an example),
but it is almost enough. Maskin (1977) proves the following remarkable result.

Theorem 1.2.2 (Maskin (1977)): When there are three or more agents, any
monotonic SCC satisfying no veto power is Nash implementable.

A SCC f satisfies no veto power if whenever choice a in state s is top ranked
for all but perhaps one agent, then $a \in f(s)$. Together with Lemma 1.2.1,
this provides nearly a full characterization.¹ Since the Pareto correspondence
satisfies no veto power and monotonicity, Theorem 1.2.2 establishes that it is
Nash implementable when there are at least three agents.

Proof. The proof is constructive. Let $R_i(s)$ and $P_i(s)$ denote agent i’s
preference relation and strict preference relation, respectively, over C in state
s. The following game form implementing f is slightly modified from that of
Moore (1991) whose presentation is based on Repullo (1987).

Each agent announces a state in S, a choice in C, and an integer.

1. If all agents announce the same state s and the same choice $a \in f(s)$, then
the outcome is a.

2. If all agents but one agree on s and $a \in f(s)$, then a is still the outcome
of the game unless the remaining agent i announces a choice b such that
$a R_i(s) b$, in which case the outcome is b.

3. Otherwise the outcome is the choice announced by the agent who
announced the highest integer. (Ties are broken in any previously specified
manner.)

We first show for every a and s with $a \in f(s)$, that a is an equilibrium
outcome when the state is s. Indeed consider the common announcement (s, a, 1).
If all agents make this announcement in state s, then any individual agent i
can change the outcome from a to b only if $a R_i(s) b$. Consequently, no agent
can profitably deviate. Thus, it is a Nash equilibrium to make the common
announcement above in state s. This then yields the outcome a in state s.

To complete the proof we must argue that if a is an equilibrium outcome in
state s, then $a \in f(s)$. So, let a denote the equilibrium outcome in state s, and
assume by way of contradiction that $a \notin f(s)$. Then, by no veto power, a is not
top ranked in state s by at least two agents. Consequently, all agents must be
announcing the choice a and the same state, t say, where $a \in f(t)$. Otherwise,
by part 3 of the game form one of the two agents could deviate (by announcing
the highest integer) and obtain his top ranked choice. Now, since $a \in f(t)$ and
$a \notin f(s)$, monotonicity implies that there is an agent i and a choice b such that
$a R_i(t) b$ and $b P_i(s) a$. But this means, by part 2 of the game form, that agent i
can deviate by announcing (s, b, 1) and render the outcome of the game b which
he strictly prefers in state s to a. But this contradicts our assumption that a is
an equilibrium outcome in state s. ∎
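The three rules of the game form used in the proof translate directly into an outcome function. The sketch below is one possible rendering, not the author's notation: messages are triples (state, choice, integer), ties in the integer game go to the lowest-numbered agent (one admissible "previously specified manner"), and the small preference profile at the end is hypothetical and serves only to exercise the rules.

```python
from collections import Counter

def outcome(messages, f, weakly_prefers):
    """Outcome of the game form for one profile of messages.

    messages[i] = (state, choice, integer) announced by agent i;
    f maps each state to its set of desired choices;
    weakly_prefers(i, s, x, y) is True when agent i weakly prefers x to y in state s.
    """
    n = len(messages)
    assert n >= 3, "the construction is stated for three or more agents"
    announced = Counter((s, a) for s, a, _ in messages)
    (s, a), count = announced.most_common(1)[0]
    if a in f[s]:
        if count == n:                       # rule 1: unanimous announcement
            return a
        if count == n - 1:                   # rule 2: exactly one agent differs
            i = next(k for k, m in enumerate(messages) if m[:2] != (s, a))
            b = messages[i][1]
            return b if weakly_prefers(i, s, a, b) else a
    # rule 3: integer game; ties go to the lowest-numbered agent (an assumption)
    winner = max(range(n), key=lambda k: (messages[k][2], -k))
    return messages[winner][1]

# Hypothetical three-agent example, rankings written best-first.
rank = {
    "s": [["a", "b", "c"], ["a", "b", "c"], ["b", "a", "c"]],
    "t": [["b", "a", "c"], ["b", "a", "c"], ["b", "a", "c"]],
}
f = {"s": {"a"}, "t": {"b"}}

def weakly_prefers(i, s, x, y):
    return rank[s][i].index(x) <= rank[s][i].index(y)

print(outcome([("s", "a", 0)] * 3, f, weakly_prefers))                             # rule 1
print(outcome([("s", "a", 0), ("s", "a", 0), ("s", "c", 0)], f, weakly_prefers))   # rule 2
print(outcome([("s", "a", 1), ("t", "b", 7), ("s", "c", 3)], f, weakly_prefers))   # rule 3
```

Rule 2 is what makes truthful unanimity an equilibrium: a lone deviator can only move the outcome to something he finds no better than a under the announced state.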

Although ingenious, the game form used to implement the SCC f in the
proof above has a drawback, namely the integer game that comes into play in
part 3 of the description of the game form. The integer game is included in order
to rule out unwanted equilibria. However, in practice when confronted with the
opportunity to play an integer game, the fact that a Nash equilibrium fails to
exist might not be reason enough not to choose to play it. For instance consider
a situation in which two players can choose to play an integer game at a cost
of $1 each. The status quo prevails if one of them decides not to play. If both
decide to play, then the one who names the highest integer receives $1000.00.
The only Nash equilibrium of this game is for both players to decide not to
play. But is it irrational to decide to play? How confident would you be that in
practice intelligent players would always decide not to play? The integer game
in the proof of Maskin’s theorem rules out unwanted equilibria in precisely this
fashion. Thus although technically the game form provided above does Nash
implement f, one might well be suspicious of the practical success such a game
form would enjoy. For an excellent critique of integer games and the like see
Jackson (1992).

¹ For a complete characterization see Moore and Repullo (1990).

1.3 Subgame Perfect Implementation 

Up to this point we have only considered Nash implementation. There are a 
variety of reasons for considering other solution concepts, in particular subgame 
perfection. Indeed, some SCCs that are not Nash implementable are subgame
perfectly implementable. In addition, the use of subgame perfection rather than
Nash equilibrium sometimes permits the use of simpler game forms and more 
compelling solutions. We now illustrate these possibilities. 

In this example we provide a nonmonotonic (and therefore, by Lemma 1.2.1,
not Nash implementable) SCC that is subgame perfectly implementable.
There are two agents 1 and 2, and two states s and t. There are three social
choices, a, b, and c. Agent 1 ranks the choices in both states in the following
strictly decreasing order: b, a, c. Agent 2’s ranking (again in strictly decreasing
order) is a, c, b in state s, and a, b, c in state t. The SCC is single valued
and given by f(s) = a, and f(t) = b. It is straightforward to verify that f is
not monotonic. However the following extensive game form subgame perfectly
implements f. Agent 1 can immediately choose a (in which case the outcome
of the game is a), or he can decline and give the move to agent 2. In the latter
case 2 is informed of this and can choose the outcome of the game to be either
b or c. It is easy to see that in state s the unique subgame perfect equilibrium
outcome is a, while in state t it is b. Consequently, this extensive game form
implements f as claimed. Note that it fails to Nash implement f (as it must)
since in state t, a remains a Nash equilibrium outcome (agent 1 chooses a, and
agent 2 if called upon to play chooses c).
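Both claims about this two-stage game form are easy to verify by enumeration. The sketch below is illustrative only: rankings are written best-first as in the text, and the names `spe_outcome`, `nash_outcomes`, "take" and "pass" are labels introduced here. It finds the backward-induction outcome in each state and, separately, all pure Nash equilibrium outcomes of the corresponding normal form.

```python
def spe_outcome(rank1, rank2):
    """Backward induction: agent 1 either takes a or passes; after a pass,
    agent 2 picks b or c. Rankings are lists, best choice first."""
    after_pass = min(("b", "c"), key=rank2.index)     # agent 2's best reply
    return min(("a", after_pass), key=rank1.index)    # agent 1 anticipates it

def nash_outcomes(rank1, rank2):
    """Outcomes of all pure Nash equilibria of the normal form, where
    agent 2's strategy is the choice she would make if called upon."""
    def result(move1, move2):
        return "a" if move1 == "take" else move2
    found = set()
    for m1 in ("take", "pass"):
        for m2 in ("b", "c"):
            x = result(m1, m2)
            ok1 = all(rank1.index(x) <= rank1.index(result(d, m2))
                      for d in ("take", "pass"))
            ok2 = all(rank2.index(x) <= rank2.index(result(m1, d))
                      for d in ("b", "c"))
            if ok1 and ok2:
                found.add(x)
    return found

rank1 = ["b", "a", "c"]                                # agent 1, both states
rank2 = {"s": ["a", "c", "b"], "t": ["a", "b", "c"]}   # agent 2, by state

for state in ("s", "t"):
    print(state, spe_outcome(rank1, rank2[state]), sorted(nash_outcomes(rank1, rank2[state])))
```

In state t the Nash outcomes include a, supported by agent 2's non-credible plan to choose c, which is exactly the equilibrium that subgame perfection removes.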

Finally we demonstrate that subgame perfection sometimes implements the 
desired SCC in a more compelling fashion than does the Nash equilibrium con- 
cept. For this we return to King Solomon’s dilemma but now we allow Solomon 
to exact fines. For concreteness, suppose that the true mother values the child 
at 20, the impostor values the child at 10 (the reader can choose the units as
he/she deems appropriate), and that Solomon can either fine both women 15
or fine neither. Letting A denote the woman whose name is Ann and B Bess,
the social choices are (A, 0), (B, 0), (A, 15), (B, 15), where the first component of
each ordered pair denotes who gets the baby, and the second denotes the amount
both women are fined. As before there are two states α and β, where Ann is
the true mother in the first and Bess the true mother in the second. Solomon’s
SCC, f, is given by f(α) = (A, 0) and f(β) = (B, 0). The preferences of Ann
and Bess are summarized in the following table. Social choices in higher rows
are strictly preferred to those in lower rows.



State α:  Ann        Bess           State β:  Ann        Bess

          (A, 0)     (B, 0)                   (A, 0)     (B, 0)
          (A, 15)    (A, 0)                   (B, 0)     (B, 15)
          (B, 0)     (B, 15)                  (A, 15)    (A, 0)
          (B, 15)    (A, 15)                  (B, 15)    (A, 15)



Evidently, f is now monotonic. For instance, when the state switches from
α to β, Bess’s preferences between (A, 0) and (B, 15) are reversed. Similarly,
when the state switches from β to α, Ann’s preferences between (B, 0) and
(A, 15) are reversed. Consequently, f in this setting with fines may well be
Nash implementable. Note that we cannot appeal to Theorem 1.2.2, since there
are only two agents here. Nonetheless, f is Nash implementable by the following
matrix game in which Ann chooses the row and Bess the column.



        (A, 0)     (B, 15)    (A, 15)
        (B, 15)    (A, 15)    (B, 0)



In state $\alpha$, the only Nash equilibrium is for Ann to choose the top row and
Bess the leftmost column. In state $\beta$, the only (pure strategy) Nash equilibrium
is for Ann to choose the bottom row and Bess the rightmost column. Thus, the
matrix game implements $f$ in (pure strategy) Nash equilibrium.
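
The equilibrium claims for the matrix game can be checked mechanically. The following is a minimal Python sketch, not part of the original exposition, that encodes the strict rankings implied by the valuations (20 for the true mother, 10 for the impostor, fine 15) and enumerates the pure strategy Nash equilibria in each state. The names `RANKINGS`, `GAME` and `pure_nash`, and the state labels `alpha` and `beta`, are illustrative assumptions.

```python
from itertools import product

# Strict rankings (best first) for each agent in each state.
RANKINGS = {
    "alpha": {  # Ann is the true mother
        "Ann":  [("A", 0), ("A", 15), ("B", 0), ("B", 15)],
        "Bess": [("B", 0), ("A", 0), ("B", 15), ("A", 15)],
    },
    "beta": {   # Bess is the true mother
        "Ann":  [("A", 0), ("B", 0), ("A", 15), ("B", 15)],
        "Bess": [("B", 0), ("B", 15), ("A", 0), ("A", 15)],
    },
}

# Outcome matrix: Ann picks the row (0 = top), Bess the column (0 = left).
GAME = [
    [("A", 0), ("B", 15), ("A", 15)],
    [("B", 15), ("A", 15), ("B", 0)],
]


def pure_nash(state):
    """Return all pure strategy Nash equilibria (row, col) in the given state."""
    ann = RANKINGS[state]["Ann"]
    bess = RANKINGS[state]["Bess"]
    equilibria = []
    for r, c in product(range(2), range(3)):
        outcome = GAME[r][c]
        # A lower index in the ranking means a more preferred outcome.
        ann_ok = all(ann.index(outcome) <= ann.index(GAME[r2][c]) for r2 in range(2))
        bess_ok = all(bess.index(outcome) <= bess.index(GAME[r][c2]) for c2 in range(3))
        if ann_ok and bess_ok:
            equilibria.append((r, c))
    return equilibria


for state in ("alpha", "beta"):
    print(state, [(rc, GAME[rc[0]][rc[1]]) for rc in pure_nash(state)])
```

Under these assumptions the enumeration returns the cell (top, left) with outcome (A, 0) in state alpha and the cell (bottom, right) with outcome (B, 0) in state beta, in line with the text.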

Note that in our whole discussion up to this point, we have said nothing 
about the agents’ preferences over lotteries over social choices. Consequently 
we have restricted attention up to now to pure strategy equilibria of the game 
forms introduced. The advantage of this approach is that it requires fewer 
assumptions about the agents’ preferences (in particular the von Neumann- 
Morgenstern axioms need not hold) and consequently (in our complete informa- 
tion setting) weaker assumptions about what each agent knows about the other 
agents’ preferences.^ The disadvantage is that the agents may well have von 
Neumann-Morgenstern preferences over lotteries and then there may be natural
mixed strategy equilibria that upset the implementation result.^ The matrix

^I am grateful to Motty Perry for pointing this out to me.

^We will see that when we restrict preferences to the von Neumann-Morgenstern class,
every social choice function is (virtually) implementable. This striking result is due to Abreu
and Matsushima (1992a). They also provide a nice discussion of a number of shortcomings of
the previous implementation literature.

game just presented is a good example. Were Ann and Bess endowed with von
Neumann-Morgenstern utility functions assigning in each state the number 4 to
the top ranked choice, the number 3 to the next, down to 1 to the lowest ranked
choice, there would in state $\beta$ be an additional equilibrium in which Ann mixes
between the two rows with equal probability, and Bess mixes between the left
and right columns giving right weight 3/4. This equilibrium is not at all
pathological and upsets the implementation result. This difficulty can be avoided
by considering the remarkably simple extensive game form described below and
employing subgame perfection rather than Nash equilibrium.
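
The mixed equilibrium just described can be verified with exact arithmetic. The sketch below is an illustration only: it assumes the utilities 4, 3, 2, 1 assigned down each agent's ranking in the state where Bess is the true mother, and checks that each agent is indifferent over the strategies she mixes between while Bess's unused middle column does strictly worse. The names `U_ANN`, `U_BESS`, `ann_mix` and `bess_mix` are assumptions introduced here.

```python
from fractions import Fraction as F

# Utilities in the state where Bess is the true mother: 4 for the top ranked
# choice down to 1 for the lowest ranked choice.
U_ANN = {("A", 0): 4, ("B", 0): 3, ("A", 15): 2, ("B", 15): 1}
U_BESS = {("B", 0): 4, ("B", 15): 3, ("A", 0): 2, ("A", 15): 1}

# Outcome matrix: Ann picks the row, Bess the column.
GAME = [
    [("A", 0), ("B", 15), ("A", 15)],
    [("B", 15), ("A", 15), ("B", 0)],
]

ann_mix = [F(1, 2), F(1, 2)]          # probabilities on (top, bottom)
bess_mix = [F(1, 4), F(0), F(3, 4)]   # probabilities on (left, middle, right)


def ann_payoff(row):
    """Ann's expected utility from a pure row against Bess's mixture."""
    return sum(q * U_ANN[GAME[row][c]] for c, q in enumerate(bess_mix))


def bess_payoff(col):
    """Bess's expected utility from a pure column against Ann's mixture."""
    return sum(p * U_BESS[GAME[r][col]] for r, p in enumerate(ann_mix))


ann_values = [ann_payoff(r) for r in range(2)]
bess_values = [bess_payoff(c) for c in range(3)]
print(ann_values, bess_values)
```

Both of Ann's rows yield 5/2, and Bess's left and right columns yield 5/2 against 2 for the middle column, so neither agent can gain by deviating.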

The game proceeds as follows (see Glazer and Ma (1989)). Ann moves first
and can choose either to give the baby to Bess, in which case the outcome is
(B, 0), or she can claim that the baby is hers. If she claims that the baby is
hers, Bess is informed of this and it is Bess's turn to move. Bess can then choose
either to give the baby to Ann, in which case the outcome is (A, 0), or she can
claim that the baby is hers, in which case the outcome is (B, 15). It is easy to
check that in state $\alpha$ the unique subgame perfect equilibrium outcome is (A, 0)
and that in state $\beta$ it is (B, 0). Moreover, this remains the case even if Ann
and Bess have von Neumann-Morgenstern preferences and mixed strategies are
considered.
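
Because all rankings are strict, the subgame perfect outcome of this two-move game can be found by backward induction. The following Python sketch is an illustration under that assumption; `RANKINGS`, `spe_outcome` and the state labels are hypothetical names, and the rankings are the ones implied by the valuations 20 and 10 and the fine of 15.

```python
# Strict rankings (best first) for each agent in each state.
RANKINGS = {
    "alpha": {  # Ann is the true mother
        "Ann":  [("A", 0), ("A", 15), ("B", 0), ("B", 15)],
        "Bess": [("B", 0), ("A", 0), ("B", 15), ("A", 15)],
    },
    "beta": {   # Bess is the true mother
        "Ann":  [("A", 0), ("B", 0), ("A", 15), ("B", 15)],
        "Bess": [("B", 0), ("B", 15), ("A", 0), ("A", 15)],
    },
}


def spe_outcome(state):
    """Backward induction on the two-move game form."""
    ann = RANKINGS[state]["Ann"]
    bess = RANKINGS[state]["Bess"]
    # Bess's node: concede gives (A, 0); claim gives (B, 15).
    after_claim = min([("A", 0), ("B", 15)], key=bess.index)
    # Ann's node: concede gives (B, 0); claim leads to Bess's choice.
    return min([("B", 0), after_claim], key=ann.index)


print(spe_outcome("alpha"), spe_outcome("beta"))
```

In state alpha Bess would concede, so Ann claims and the outcome is (A, 0); in state beta Bess would claim, so Ann concedes at once and the outcome is (B, 0). No fine is paid on either equilibrium path.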

For general results on subgame perfect implementation see Moore and Re- 
pullo (1988) and Abreu and Sen (1990). We now turn to one of the most 
important recent developments in the theory of implementation. It is due to 
Abreu and Matsushima (1992a). 

1.4 Virtual Implementation 
1.4.1 The Setup 

In this section we shall not rule out the use of lotteries as we've done for the
most part up to now. Moreover, we shall assume that the agents have von
Neumann-Morgenstern utilities over lotteries over the social choices in each
state. Let $A$ denote the set of lotteries over the given finite set of social choices.
Thus $A$ might as well be the unit simplex in $\mathbb{R}^m$, where $m$ is the number of social
choices and the $j$-th unit vector represents the lottery assigning probability one
to the $j$-th social choice. Let
$u_i(a, s_i)$ denote agent $i$'s von Neumann-Morgenstern utility of lottery $a$ in state
$s = (s_1, \ldots, s_n) \in S \subseteq S_1 \times \cdots \times S_n$. We shall restrict attention to single-valued
SCCs. Thus, $f$ is a social choice function (SCF) mapping the set of states,
$S$, into lotteries over social choices, $A$. Finally, we shall be concerned with
implementation in iteratively strictly undominated (IU) strategies. This is a
very weak solution concept in that any outcome supported by any refinement
of Nash equilibrium survives iterative removal of strictly dominated strategies.



^Moore (1991) generalizes this result to the case in which each woman knows the value the
other places on the child, but Solomon only knows that the true mother's value is the higher
of the two. Perry and Reny (1994b) extend this further to the case in which the women's
values are purely private information but each knows who has the higher value (i.e. each
knows who the true mother is). The latter paper implements Solomon's SCC in iteratively
(weakly) undominated strategies.

But there is an advantage in this weakness. One need assume much less about
the agents in order to conclude that they will play an IU strategy as compared
to what must be assumed to conclude that they will play a Nash equilibrium.
Thus from a practical point of view there is much to be gained from considering
implementation in IU strategies. The real surprise is that IU implementation is
not hopeless.

A number of significant ideas will be brought together here to produce an 
extraordinarily permissive implementation result. The first of these is the idea 
that one ought to be content if one can implement the desired social choice 
function up to any prespecified degree of accuracy. 

Definition 1.4.1: The SCF $f$ is virtually implementable in iteratively (strictly)
undominated strategies if $\forall \varepsilon > 0$, $\exists$ an SCF $g$ with $\|g(s) - f(s)\| < \varepsilon$ $\forall s \in S$
such that $g$ is implementable in iteratively strictly undominated strategies.^

We shall maintain the following assumption throughout this section.

Assumption A: For all agents $i$, and all pairs of distinct $s_i, t_i \in S_i$, there is a
pair of lotteries, $a, b \in A$, such that $u_i(a, s_i) > u_i(b, s_i)$ and $u_i(a, t_i) < u_i(b, t_i)$.^

Insight 1: Under Assumption A, every SCF is virtually monotonic in the following
sense. For every SCF $f$, and every $\varepsilon > 0$, there is a monotonic SCF $g$ that
is $\varepsilon$-close to $f$. This is a consequence of the self-selection result below. Thus
monotonicity, the necessary condition for Nash implementation, is automatically
(virtually) satisfied when lotteries are available.

Lemma 1.4.1 (Self-Selection): Under Assumption A, $\forall i$, $\exists\, d_i : S_i \to A$ such
that $u_i(d_i(s_i), s_i) > u_i(d_i(t_i), s_i)$, $\forall$ distinct $s_i, t_i \in S_i$.

Proof. For each agent $i$, let $A_i$ denote the union of all lotteries $a$ and $b$ as
in Assumption A when all distinct pairs $s_i, t_i$ are considered. Without loss of
generality we may assume that $u_i(\cdot, s_i)$ strictly orders $A_i$ for all $s_i \in S_i$. Let
$p_1, p_2, \ldots, p_{\#A_i}$ be a strictly decreasing sequence of positive probabilities whose
sum is one. For each $s_i$, define $d_i(s_i)$ to be the lottery giving the $j$-th ranked
member of $A_i$ (according to $s_i$) probability $p_j$. ■
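
The construction in the proof is easy to try on a small example. The sketch below is illustrative only: the three members of $A_i$, the two personal states `s` and `t`, their utilities, and the weights are all hypothetical data chosen so that the two states induce different strict rankings. It builds each $d_i(\cdot)$ by attaching the $j$-th largest weight to the $j$-th ranked member and checks the self-selection inequality in both directions.

```python
from fractions import Fraction as F


def self_selection_lottery(utility, weights):
    """Give the j-th largest weight to the j-th best member of A_i.

    utility: dict mapping each member of A_i to its utility in one personal
             state (assumed strict, i.e. without ties).
    weights: strictly decreasing positive probabilities summing to one.
    Returns a dict mapping each member of A_i to its probability.
    """
    ranked = sorted(utility, key=utility.get, reverse=True)
    return dict(zip(ranked, weights))


def expected_utility(lottery, utility):
    """Expected utility of a lottery over members of A_i."""
    return sum(prob * utility[x] for x, prob in lottery.items())


# Hypothetical data: utilities of three members of A_i in two personal states.
U = {
    "s": {"a": 3, "b": 2, "c": 1},
    "t": {"a": 1, "b": 3, "c": 2},
}
P = [F(1, 2), F(1, 3), F(1, 6)]   # strictly decreasing, sums to one

d = {state: self_selection_lottery(U[state], P) for state in U}

for own, other in (("s", "t"), ("t", "s")):
    truthful = expected_utility(d[own], U[own])
    misreport = expected_utility(d[other], U[own])
    print(own, truthful, misreport, truthful > misreport)
```

In each state the agent strictly prefers the lottery designed for that state. This reflects the rearrangement argument behind the lemma: pairing larger weights with better-ranked members maximizes expected utility, and strictly so whenever the two rankings differ.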

To see that self-selection yields Insight 1 above, consider the social choice 
function 

$$g(s) = \varepsilon d_1(s_1) \oplus \varepsilon d_2(s_2) \oplus \cdots \oplus \varepsilon d_n(s_n) \oplus (1 - n\varepsilon) f(s), \qquad (1)$$



^$\|\cdot\|$ denotes Euclidean distance in the space $\mathbb{R}^m$ containing $A$. Consequently, the definition requires that in every
state, the probability that $g$ assigns to any particular social choice is within $\varepsilon$ of that assigned
by $f$. In this case, we'll say that $g$ is $\varepsilon$-close to $f$.

^Abreu and Matsushima (1992a) show that Assumption A is a consequence of the following
two conditions: (i) no agent is indifferent over all of $A$ in some state, and (ii) different states
in $S_i$ for agent $i$ induce different preferences over $A$ for $i$.

where for lotteries $a$ and $b$, and $p \in [0, 1]$, $pa \oplus (1 - p)b$ denotes the compound
lottery in which $a$ occurs with probability $p$ and $b$ with probability $1 - p$. For
$\varepsilon$ small enough, $g$ is arbitrarily close to $f$. Moreover, with the $d_i$ chosen as
in the self-selection Lemma, $g$ is easily seen to be monotonic.

Given the SCF $f$, our object will be to implement the SCF $g$ above in
iteratively undominated strategies for $\varepsilon > 0$ arbitrarily small.

Insight 2: If $\varepsilon$ were large enough in (1), then we could implement $g$ (which
would then be very much like a random dictator SCF) in dominant strategies.
Thus when the probability that each player is a dictator (namely $\varepsilon$) looms large
relative to the probability assigned to $f$, there is no problem. Of course we are
concerned with the case in which $\varepsilon$ is small. An ingenious way of getting the
right relative weights is to break the probability that $f$ occurs into many ($k$, say)
small (relative to $\varepsilon$) pieces. So write $g$ as

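$$g(s) = \varepsilon d_1(s_1) \oplus \varepsilon d_2(s_2) \oplus \cdots \oplus \varepsilon d_n(s_n) \oplus \frac{1 - n\varepsilon}{k} f(s) \oplus \cdots \oplus \frac{1 - n\varepsilon}{k} f(s). \qquad (2)$$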

In order for this to have any real effect, the game form must be designed so that
the agents can affect each of the last $k$ outcomes of the lottery in $g$ independently
of one another. Once this is done, agent $i$'s incentive to change any one of the
last $k$ components of the lottery $g(s)$ will be outweighed by his incentive to
obtain $d_i(s_i)$ in the event he is chosen as dictator on $A_i$, since the latter event is
much more likely to occur than the former. The game form we shall construct
below (due to Abreu and Matsushima (1992a)) takes full advantage of this idea.
In order to simplify matters we make the following assumption.^

Assumption B: Small personal taxes can be levied. Moreover, agent $i$'s utility
of lottery $a$ together with a tax of $\tau > 0$ in state $s$ is $u_i(a, s_i) - \tau$.

1.4.2 The Game form 



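Each agent $i$ (simultaneously) sends a message consisting of $k + 1$ cells. The first
cell is a personal state $s_i \in S_i$. The remaining $k$ cells are each members of the
state space $S$. Thus a typical message is of the form $m_i = (m_i^0, m_i^1, \ldots, m_i^k)$, where
$m_i^0 \in S_i$, and $m_i^j \in S$ for all $j = 1, 2, \ldots, k$. The joint message $(m_1, m_2, \ldots, m_n) \in
\times_{i=1}^{n} [S_i \times S^k]$ determines the components $a_1, \ldots, a_n, b_1, \ldots, b_k \in A$ of the lottery

$$\varepsilon a_1 \oplus \cdots \oplus \varepsilon a_n \oplus \frac{1 - n\varepsilon}{k} b_1 \oplus \cdots \oplus \frac{1 - n\varepsilon}{k} b_k \qquad (3)$$

as follows:

$$a_i = d_i(m_i^0)$$

$$b_j = \begin{cases} f(s), & \text{if } m_l^j = s \text{ for all but perhaps at most one } l = 1, \ldots, n \\ a^*, & \text{otherwise,} \end{cases}$$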



^Abreu and Matsushima (1992a) do not assume this. Instead they assume that each agent
can be punished (if only slightly) independently of other agents.

where $a^*$ is some previously specified lottery. Note that the last $k$ cells of agent
$i$'s message independently affect the last $k$ components of the lottery in (3),
and the first cell determines the outcome in the event that agent $i$ is given the
opportunity to dictate over the elements of $A_i$. Recall Insight 2 above for the
significance of this. Also note that if all agents report honestly in state $s$, then
the outcome will be $g(s)$.

In addition to determining the outcome of the above lottery, the joint message
determines the agents' personal taxes. Before describing these, we first
provide an appropriate choice of the parameter $k$ to be employed in (3) as well
as two other parameters, $\tau$ and $\delta$, where the first is used to compute the personal
taxes and the second provides a lower bound on the utility gain available
from switching to an honest report from a false one when agent $i$ is chosen to
be dictator on $A_i$. Specifically choose $k$, $\tau$, and $\delta$ all positive such that:



$$u_i(d_i(s_i), s_i) - u_i(d_i(t_i), s_i) \ge \delta \quad \text{for all distinct } s_i \text{ and } t_i \in S_i \qquad (4)$$

$$\tau < \varepsilon\delta \qquad (5)$$

$$k > \frac{\Delta(1 - n\varepsilon)}{\tau} + 1 \qquad (6)$$

where $\Delta = \max[u_i(a, s_i) - u_i(a', s_i)]$, and where the maximum is taken over all
agents $i$, personal states $s_i$, and lotteries $a$ and $a'$. Note that the Self-Selection
Lemma ensures that (4) can be satisfied.
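
To see how the three conditions interact, the following Python sketch picks a tax level and a number of cells for given primitives. It is an illustration under stated assumptions rather than the authors' procedure: the inputs `n`, `eps`, `delta` and `Delta` are hypothetical numbers, the rule "half of the bound" for the tax is an arbitrary admissible choice, and condition (6) is taken in the form $k > \Delta(1 - n\varepsilon)/\tau + 1$.

```python
import math
from fractions import Fraction as F


def choose_parameters(n, eps, delta, Delta):
    """Return (tau, k) with tau < eps * delta and k > Delta * (1 - n * eps) / tau + 1.

    The specific choices (half of the upper bound for tau, the smallest
    admissible integer for k) are illustrative; any values meeting the two
    inequalities would do.
    """
    assert 0 < n * eps < 1 and delta > 0 and Delta > 0
    tau = eps * delta / 2                        # satisfies (5)
    bound = Delta * (1 - n * eps) / tau + 1      # right-hand side of (6)
    k = math.floor(bound) + 1                    # smallest integer strictly above it
    return tau, k


# Hypothetical primitives: 3 agents, eps = 1/100, delta = 1/6, Delta = 2.
n, eps, delta, Delta = 3, F(1, 100), F(1, 6), F(2)
tau, k = choose_parameters(n, eps, delta, Delta)

# Quantity that must be positive for the argument behind (6) to go through.
net_gain = tau - tau / k - (1 - n * eps) * Delta / k
print(tau, k, net_gain > 0)
```

With these numbers the tax is 1/1200 and 2330 cells are needed, which illustrates how quickly the number of cells grows as the approximation is tightened.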

There are two kinds of personal taxes that are potentially imposed, an outlier
tax and a false report tax. These are determined by the joint message $(m_1, \ldots, m_n)$
as follows.

Outlier Tax: Let $\tau_O$ denote the outlier tax. Set $\tau_O = \tau/k$. The outlier tax
is levied on an agent whenever that agent is the only one to disagree with the
reports of the others in a particular cell (beyond the zeroth) of his message. That
is, agent $i$ is taxed $\tau_O$ for every cell $l = 1, \ldots, k$ such that $m_j^l = m_{j'}^l \ne m_i^l$
for all $j, j' \ne i$.

False Report Tax: Let $\tau_F$ denote the false report tax. Set $\tau_F = \tau$. The false
report tax is levied at most once per agent when that agent in some cell of
his message does not announce the (induced) state announced in the combined
zeroth cells of all agents when all previous cells of all agents did so. That is, let
$j$ be the first cell such that $m_{i'}^j \ne (m_1^0, \ldots, m_n^0) \in S$ for some agent $i'$. Then agent
$i$ is taxed $\tau_F$ if $m_i^j \ne (m_1^0, \ldots, m_n^0)$.

Each agent's total tax is the sum of his false report tax and all his outlier taxes.
This completes the description of the game form.
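
The two tax rules can be stated compactly in code. The following Python sketch is one reading of the rules above, offered as an illustration: the function name `personal_taxes`, the tuple encoding of messages, and the sample states are assumptions, and the outlier rule is only meaningful with at least three agents.

```python
from fractions import Fraction as F


def personal_taxes(messages, tau, k):
    """Total tax of each agent for a joint message.

    messages[i] = (m_i^0, m_i^1, ..., m_i^k): cell 0 is a personal state and
    cells 1..k are full states (tuples of personal states).
    """
    n = len(messages)
    tau_o, tau_f = tau / k, tau
    induced = tuple(m[0] for m in messages)   # state announced by the zeroth cells
    taxes = [F(0)] * n

    # Outlier tax: in a given cell all other agents agree and agent i differs.
    for cell in range(1, k + 1):
        for i in range(n):
            others = {messages[j][cell] for j in range(n) if j != i}
            if len(others) == 1 and messages[i][cell] not in others:
                taxes[i] += tau_o

    # False report tax: charged only in the first cell where some agent departs
    # from the induced state, and only to the agents who depart in that cell.
    for cell in range(1, k + 1):
        deviators = [i for i in range(n) if messages[i][cell] != induced]
        if deviators:
            for i in deviators:
                taxes[i] += tau_f
            break
    return taxes


# Hypothetical example with three agents and k = 2 cells.
s = ("s1", "s2", "s3")
t = ("t1", "s2", "s3")
honest = [("s1", s, s), ("s2", s, s), ("s3", s, s)]
one_liar = [("s1", s, s), ("s2", s, s), ("s3", t, s)]
two_liars = [("s1", s, s), ("s2", t, s), ("s3", t, s)]
TAU, K = F(1, 10), 2

for joint in (honest, one_liar, two_liars):
    print(personal_taxes(joint, TAU, K))
```

The last case shows a feature used in the proof: an agent who reports honestly can still pay the outlier tax when all the others agree on a false state in some cell.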

1.4.3 The Result 



Theorem 1.4.2: The above game form implements $g$ defined by (1) in iteratively
strictly undominated strategies.

Proof. Suppose that the state is $s = (s_1, \ldots, s_n)$. It suffices to show that for
each agent $i$ the only iteratively undominated message is $(s_i, s, \ldots, s)$. To do so
we shall show that the first round of elimination of dominated messages implies
that the first cell of agent $i$'s message must be $s_i$, and that the second and
subsequent rounds imply that the second and subsequent cells must be $s$.

First Round: Suppose that the first cell in $i$'s message is not $s_i$. Then the gain
from switching only the first cell of his message to $s_i$ is at least $\varepsilon\delta$ (in expected
utility) by (4). The loss in making such a switch is at most $\tau_F$ since the outlier
tax levied on $i$ is unaffected by the switch. Consequently, the net gain of the
switch is at least $\varepsilon\delta - \tau_F$, which is positive by definition of $\tau_F$ and (5). Thus all
messages not involving honest reporting in the first cell are removed in the first
round of elimination.

Second Round: From the first round all agents' first cells are honest reports
of their personal state. Thus, $(m_1^0, \ldots, m_n^0) = s$. Suppose that some agent $i$'s
second cell is not equal to $s$.

Case A. Consider the case in which some other agent's second cell is also not
equal to $s$. If agent $i$ switches only the second cell of his message to $s$, then he
gains $\tau_F$ in utility since he no longer pays the false report tax. This switch may
affect the lottery that enters in the place of $b_1$ in (3), which carries weight
$(1 - n\varepsilon)/k$, and it may also result in
$i$ paying an additional outlier tax (since it may be that all other agents' second
cells are equal and distinct from $s$). Thus $i$'s loss (in expected utility) from
making such a switch is at most $\tau_O + \frac{1 - n\varepsilon}{k}\Delta$. Consequently, $i$'s net gain from
this switch is at least $\tau_F - \left(\tau_O + \frac{1 - n\varepsilon}{k}\Delta\right) = \tau - \frac{\tau}{k} - \frac{1 - n\varepsilon}{k}\Delta$, which is positive by
(6).

Case B. Consider the alternative case in which all other agents report $s$ in the
second cell of their message. The gain to agent $i$ from switching the second cell
of his message to $s$ is now $\tau_O + \tau_F$, since both the outlier and false report taxes
are avoided for this cell. Since all other agents' second cells matched, $i$'s second
cell report has no impact on the lottery $b_1$ in (3). Hence the only potential loss
in making the switch is that $i$ might incur the false report tax for a different
cell. Thus $i$'s net gain from the switch is at least $\tau_O$, which is positive.

Hence after the first round of elimination, it is strictly dominant for each
agent to report $s$ in the second cell of his message. The argument for the second
cell can be repeated to conclude that for each agent $i$ the only message surviving
iterative elimination of strictly dominated strategies is $(s_i, s, \ldots, s)$. ■

Two features of the game form are well worth noting: (i) it involves no
integer game, and (ii) mixed strategies are not ruled out of the analysis. It
is remarkable what has been achieved here. Not only is the implementation
result completely permissive (i.e. all SCFs are virtually IU implementable),
but the equilibrium concept is extraordinarily weak. Indeed it is enough that
expected utility maximizing behavior be common knowledge among the players
in order that only iteratively undominated strategies survive. Moreover, two
fundamental difficulties inherent in previous implementation results (namely
the use of integer games and the ban on consideration of mixed equilibria) have
been entirely overcome. It should also be noted that the total taxes levied can
be made arbitrarily small both on and off the equilibrium path. Finally, we
remark that the full force of the expected utility hypothesis is not needed. In
addition to continuity, the only required property is the following: Suppose that
the probabilities assigned to two outcomes of a compound lottery are switched.
If the newly created compound lottery now gives higher weight to the (strictly)
more preferable of the two outcomes, then the new compound lottery is strictly
preferred to the old.

In our view, the result due to Abreu and Matsushima (1992a) reviewed here
necessitates a fundamental change in the direction of research in implementation
theory. Their result has more or less completely settled the question
"What SCFs can be implemented?" from a theoretical standpoint. In practice,
the number of rounds of elimination required in the Abreu-Matsushima
game form depends upon how closely one wishes to approximate the original
SCF. The number of rounds increases (without bound) as the approximation
improves. If one is uneasy about long chains of iterative elimination being made
by individuals in practice, then one might well hesitate to actually employ the
Abreu-Matsushima game form. See Glazer and Rosenthal (1992) for further
comments on this issue as well as the reply in Abreu and Matsushima (1992b).
Thus there remains an important avenue for research, namely the search for
"simple" and "compelling" game forms that are capable (in practice) of
implementing "interesting" and "important" SCFs.

2. The Core 
2.1 Introduction 

There is little doubt that the core is a central idea both in game theory and 
economics. In the sequel I shall focus on two questions. 

1. Under what conditions can a planner ensure any (and only) core outcomes? 

2. Are core outcomes inevitable? 

The second question is the familiar one of Edgeworth (1881) and it lies at the 
heart of the positive (as opposed to the normative) interpretation of the core. 
Put somewhat differently, "Are outcomes not in the core necessarily unstable?"

We will begin by focusing our attention on the first question. To the extent
that the core has normative appeal, this question is clearly of interest. In
the language of the previous sections it asks whether or not the core can be
implemented. It is straightforward to show that the core correspondence, viewed
as an SCC, is monotonic. Moreover, on economic domains it satisfies no veto
power.^ Consequently, by Theorem 1.2.2 the core can be Nash implemented.



^In an exchange economy setting in which all commodities are desirable, no veto power
is vacuously satisfied since the allocation top ranked for a consumer is that which gives him
everything and hence it cannot be top ranked for anyone else.

Since our concerns at this point are largely normative, it is rather important
to have confidence that the game form employed will actually perform in practice 
as it does according to the theory. With this in mind, it makes sense to search for 
game forms having rather simple message spaces. The message space employed 
in the proof of Maskin’s Theorem is potentially complex. Indeed it is the entire 
state space and more. Thus in particular, each agent is asked to report the 
preferences of all other agents over the entire set of social choices. In specific 
cases it is sometimes possible to simplify the message space substantially. Again 
we emphasize that this has at most practical, not theoretical, consequences. We 
now present a game form implementing the core in subgame perfect equilibrium 
with rather simple message spaces. The presentation is based on Serrano and 
Vohra (1993). 

2.2 An Implementation of the Core 



Let $(v, N)$ be a TU game. Thus, $N$ is a finite (with cardinality $n$) set of players,
and $v(\cdot)$ is a function mapping nonempty subsets of $N$ (coalitions) into
nonnegative numbers. For a payoff vector $x \in \mathbb{R}^n$, $x(S)$ denotes the sum of the
components $x_i$ of $x$ for $i \in S$. A feasible proposal is a pair $(x, S)$ where $x \in \mathbb{R}^n$,
$S \subseteq N$ and $x(S) \le v(S)$. A payoff vector $x$ is in the core of $(v, N)$ if $x(S) \ge v(S)$
for every $S \subseteq N$, with equality for $S = N$.
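
The core condition is a finite list of inequalities, so membership can be tested directly for small games. The following Python sketch is an illustration: the names `coalitions`, `in_core`, `V3` and `PLAYERS` are assumptions, coalitions missing from the dictionary are treated as having worth zero, and the sample game is the three-player game used in Example 3 later in this section.

```python
from itertools import combinations


def coalitions(players):
    """Yield every nonempty coalition as a frozenset."""
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)


def in_core(x, v, players, tol=1e-9):
    """Check x(S) >= v(S) for every coalition S, with equality for S = N.

    x: dict mapping player to payoff.
    v: dict mapping frozenset coalition to worth (missing coalitions count as 0).
    """
    grand = frozenset(players)
    if abs(sum(x[i] for i in grand) - v.get(grand, 0)) > tol:
        return False
    return all(sum(x[i] for i in S) >= v.get(S, 0) - tol for S in coalitions(players))


# v(N) = v({1,2}) = v({1,3}) = 1 and all other coalitions are worth 0.
PLAYERS = [1, 2, 3]
V3 = {
    frozenset({1, 2, 3}): 1,
    frozenset({1, 2}): 1,
    frozenset({1, 3}): 1,
}

print(in_core({1: 1, 2: 0, 3: 0}, V3, PLAYERS))      # player 1 takes everything
print(in_core({1: 0.5, 2: 0.5, 3: 0}, V3, PLAYERS))  # blocked by coalition {1, 3}
```

The enumeration is exponential in the number of players, so this is only a checking device for small examples.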

Of course, the planner, who wishes to implement the core of $(v, N)$, does
not know the characteristic function $v$. Otherwise he would simply impose an
outcome in the core. We shall however make the following assumptions about
the planner.

1. The planner can extract fines (measured in payoff units) from each player. 

2. The planner can verify the feasibility of any payoff vector. 

3. The planner can impose a coalition structure. 

The content of (1) beyond the obvious is that the planner is assumed to
know a lower bound on the resources of each agent. Without this, an agent
might claim to have no resources in order to avoid paying a fine. Assumption
(2) says that for each coalition $S \subseteq N$, and any payoff vector $x$, the planner
can verify whether or not $x(S)$ can be attained by the members of $S$. This does
not require the planner to know $v(S)$.^ The last assumption is self-explanatory.
We now describe a two stage game implementing the core of $(v, N)$ in subgame
perfect equilibrium.



^For instance, in an exchange economy setting the planner may know the players' preferences
but be unaware of their endowments. The planner can nonetheless verify the feasibility
of any payoff vector by simply requiring the players to display the required amounts of
commodities that they claim to have. On the other hand the planner may not know the players'
preferences but may know their endowments. Although the utility vector now cannot be
verified, the planner can verify the feasibility of any allocation for any coalition. The mechanism
below can be easily modified to implement the core in this (perhaps more natural) case as
well.

Stage 1: All players $i$ simultaneously announce a feasible proposal of the form
$(x^i, N)$,^ and a natural number $m_i$.

1. If $x^i \ne x^j$ for some $i$ and $j$, then player $i' = \sum_{k \in N} m_k \bmod n$ is fined.
The coalition structure in which all players are alone is then imposed.

2. If $x^i = x$ for all $i \in N$, then all players are informed of all the announcements
$(x^i, m_i)$ and the game proceeds to Stage 2.

Stage 2: Player $i' = \sum_{k \in N} m_k \bmod n$ from Stage 1 makes a feasible proposal
$(y, S)$.^ All members of $S$ are informed of this proposal and respond to it (by
saying "accept" or "reject") sequentially in some predetermined order. If all
members of $S$ accept then the game ends with $i \in S$ obtaining $y_i$ and $i \notin S$
obtaining $v(i)$. If at least one member of $S$ rejects the proposal, then the grand
coalition is imposed and each $i \in N$ obtains the status quo from Stage 1.
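
The role of the announced numbers is that any single player can steer the selection by changing his own number. The short Python sketch below illustrates this; it assumes players are labelled 0 to n-1 so that the residue modulo n names a player directly, and the function names are hypothetical.

```python
def proposer(numbers):
    """Label (0..n-1) of the player picked by the modulo rule."""
    return sum(numbers) % len(numbers)


def number_making_me_proposer(numbers, me):
    """A positive integer that player `me` can announce, holding the other
    announcements fixed, so that the modulo rule selects `me`."""
    n = len(numbers)
    others_total = sum(numbers) - numbers[me]
    # Adding n keeps the residue unchanged and guarantees a positive number.
    return (me - others_total) % n + n


announced = [5, 2, 9]   # hypothetical Stage 1 numbers
for me in range(len(announced)):
    revised = list(announced)
    revised[me] = number_making_me_proposer(announced, me)
    print(me, revised, proposer(revised))
```

Since every player always has such a number available, a player who would otherwise be fined can pass the fine to someone else, and any player can make himself the Stage 2 proposer.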

Proposition 2.2.1: The above game form implements the core of $(v, N)$ in
pure strategy subgame perfect equilibrium.

Proof. We leave it to the reader to verify that any core payoff $x$ can be
supported as an SPE. The equilibrium path has all players announcing the
payoff vector $x$ and some natural number in stage one; the player chosen as
proposer in stage two proposing $(x, N)$; and all players accepting.

To prove the converse, we suppose that $z$ is a pure strategy SPE outcome.

STEP 1: Each player's first stage payoff vector announcement must be the same.
Otherwise player $i'$ could change his announced number and avoid the fine and
increase his payoff. Let this common stage 1 payoff announcement be $x$.

STEP 2: Each player $i$ can, by changing his announced number, become the
proposer in stage 2 and there propose $(x, N)$. Subsequently, regardless of how
the others respond, the outcome will be $x$. Since the outcome is in fact $z$, this
implies that $z_i \ge x_i$ for all players $i$.

STEP 3: Suppose by way of contradiction that $z$ is not in the core of $(v, N)$.
Then for some payoff vector $y$ and coalition $S$ with $y(S) \le v(S)$, we have $y_i > z_i$
for all $i \in S$. Choose $j \in S$ and consider a deviation by $j$ rendering $j$ the
proposer in stage 2 (the status quo then remains $x$) and in which $j$ proposes $(y, S)$.
Clearly subgame perfection demands that every $i \in S$ accept this proposal since
$y_i > z_i \ge x_i$ for all $i \in S$. Hence $j$ can profitably deviate, a contradiction. ■

Remark 2.2.1: The restriction to pure strategies is essential. The presence
of the "modulo" game in stage 1 creates mixed strategy equilibria which would
upset the result were they admissible. This is a serious drawback of this
mechanism.



^The feasibility constraint can be enforced by the planner since he can verify feasibility.
^Again, feasibility is enforced by the planner.

Remark 2.2.2: No refinement beyond (pure strategy) subgame perfection is
required. This is in contrast to the results to follow which impose stationarity
as well.

Remark 2.2.3: The Proposition implies that the existence of a pure strategy 
SPE relies on the nonemptiness of the core of the game. Thus if (and only if) 
the game is balanced a pure strategy SPE exists. This is in contrast to the 
model of Perry and Reny (1994a) in which total balancedness is required. 
Remark 2.2.4: Finding a simple, compelling game form that implements the 
core without excluding mixed strategies remains an open problem. 

2.3 A Noncooperative Approach To The Core 

2.3.1 A Canonical Discrete-Time Coalitional Bargaining Model 

We now consider our second question, "Are core outcomes inevitable?" Somewhat
more precisely, must the outcome of a setting in which agents can interact
in an unfettered manner be in the core? Note that the motivation here is quite 
different from that of the previous section. There is no planner here and the core 
is not viewed as a desirable goal that somehow must be attained. Rather, the 
focus is on identifying those settings in which the core happens to be attained. 
The significance of the results should then be judged in terms of the naturalness 
of the settings that are identified. 

We shall focus on one particular setting whose underlying opportunities are
summarized by the TU game $(v, N)$. The following extensive form game with
perfect information attempts to capture the idea that the players can interact
in an "unfettered manner."

For each coalition $S \subseteq N$, let $\phi(S)$ denote an ordering of the members of
$S$. Call $\phi$ a protocol. The game proceeds as follows. Calendar time, indexed
by $t$, is initialized to zero. Players discount the future at the common discount
rate $\delta \in (0, 1]$. The first player according to $\phi(N)$ is chosen as the proposer. He
makes a feasible proposal $(x, S)$. The members of $S$ respond (accept or reject)
in order according to $\phi(S)$. If $i \in S$ is the first to reject the proposal, then
calendar time moves ahead one unit, $i$ becomes the proposer and the above
process repeats. If all $i \in S$ accept $(x, S)$, then each $i \in S$ receives payoff
$\delta^t x_i$ and leaves the game. Calendar time does not move ahead. The game now
proceeds as at the start, with the remaining players taking the role of $N$. Thus,
if $T$ denotes the set of remaining players, the first player according to $\phi(T)$ is
chosen to be the proposer, etc. Call this the coalitional bargaining game.

This model (with $\delta = 1$) is due essentially to Selten (1981). A similar game
is employed by Moldovanu and Winter (1991) and Chatterjee, Dutta, Ray and
Sengupta (1993). The last paper assumes strict discounting (i.e. $\delta < 1$). Note
that the discounting case of Rubinstein's (1982) bargaining model is a special
case of the above game.

Our aim is to consider the subgame perfect equilibria of this game with a
view to supporting only core (and all core) outcomes. A number of concerns
are immediately apparent.

Stationarity: Without discounting, many non-core outcomes can be supported
as SPE. Indeed, even with discounting there is a folk theorem due to Shaked-like
strategies (see Osborne and Rubinstein (1990), chapter 3). Hence, a refinement
beyond subgame perfection is necessary. We'll therefore consider stationary
SPE. These are subgame perfect equilibrium strategies in the usual sense that
also satisfy the following stationarity property: players' actions depend only
upon the set of remaining players and the current proposal.

Discounting: Since Rubinstein's (1982) model is a special case of the coalitional
bargaining game, the latter cannot always support all core outcomes. For
example, in Rubinstein's (1982) model, any division of the pie is in the core,
yet only a single division is subgame perfect. Thus, if we insist on including
discounting, there is no hope of supporting all core outcomes. On the other
hand, we may be content so long as those outcomes that can be supported are
in the core. The following example (due to Chatterjee et al. (1993)) shows that
the presence of discounting may preclude even this.

Example 1. $v(N) = 1 + \mu$, $v(\{1,2\}) = v(\{1,3\}) = 1$, $v(\{2,3\}) = \varepsilon > 0$, where
$\varepsilon$ is small and $0 < \mu < 1/2$. For $\delta > \ldots$ every stationary equilibrium has the
common features described in the following table.

        Players Remaining        Reservation Payoff

        {1,2,3}                  $\delta/(1+\delta)$
        {1,2} or {1,3}           $\delta/(1+\delta)$
        {2,3}                    $\varepsilon\delta/(1+\delta)$
        {1}, {2}, or {3}         0

The second row of the table, for instance, indicates that if the set of remaining
players is {1,2} (or {1,3}), then both players 1 and 2 (resp. 3) accept any
proposal giving them a payoff of at least $\delta/(1+\delta)$. They reject all other proposals.
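
The two-player entries of the table follow the familiar Rubinstein fixed-point logic: with two players left and worth $v$ to divide, a responder accepts exactly what he could secure by rejecting and proposing next period, so his reservation payoff $r$ solves $r = \delta(v - r)$. The Python sketch below checks this with exact arithmetic; it assumes singleton coalitions are worth zero, and the numerical values of the discount factor and of the worth of {2,3} are hypothetical.

```python
from fractions import Fraction as F


def reservation_payoff(worth, delta):
    """Responder's reservation payoff r in the two-player stationary equilibrium.

    r solves r = delta * (worth - r): rejecting makes the responder the next
    proposer, who keeps worth - r, received one period later.
    """
    return delta * worth / (1 + delta)


delta, eps = F(9, 10), F(1, 100)   # hypothetical values for illustration

r_12 = reservation_payoff(F(1), delta)   # coalition {1,2} or {1,3}, worth 1
r_23 = reservation_payoff(eps, delta)    # coalition {2,3}, worth eps

print(r_12, r_12 == delta * (1 - r_12))
print(r_23, r_23 == delta * (eps - r_23))
```

Both values satisfy the fixed-point equation, which gives the entries $\delta/(1+\delta)$ and $\varepsilon\delta/(1+\delta)$ for the two-player rows.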

Consequently, the grand coalition does not form. Although as we shall see 
below the protocol can play a role in determining whether or not the outcome 
is efficient, it plays no such role here. The inefficiency remains regardless of 
the protocol. It is the presence of discounting that accounts for the inefficiency 
here. 

Player Order: The following examples indicate the important effect that the
(essentially arbitrary) protocol, $\phi$, has on the outcome of the game. The first is
taken from Chatterjee et al. (1993).



This is a rather strong stationarity assumption. It is employed by Chatterjee et al. (1993).
Moldovanu and Winter (1991) also allow the accept/reject decision to depend upon the set
of players who have so far accepted the current proposal. We adopt the stronger version here
only so that we can provide a single analysis that is consistent with both papers.

Example 2. $v(N) = 1.3$, $v(\{1,2\}) = 1$, $v(\{1,3\}) = v(\{2,3\}) = 0.1$, and $v(i) = 0$
for $i = 1, 2, 3$. For $\delta \in (3/7, 1)$, and for each protocol, there is a unique stationary
SPE. Although the equilibria may differ with different protocols, they each have
the following in common. For $i \in \{1, 2\}$, whenever it is $i$'s turn to make a
proposal, $i$ proposes $((\ldots), \{1, 2\})$. Whenever it is player 3's turn to make
a proposal, 3 proposes $((\ldots, 1.3 - \ldots), \{1, 2, 3\})$. Consequently, if the
protocol has player 1 or 2 going first, the outcome will be inefficient.

One might suspect that the protocol alone cannot account for inefficiency
and that discounting is in fact always the culprit. The next example, in which
there is no discounting at all, shows this to be incorrect.

Example 3. $v(N) = 1$, $v(\{1,2\}) = v(\{1,3\}) = 1$, $v(\{2,3\}) = v(\{i\}) = 0$, for
$i = 1, 2, 3$. The following table describes stationary strategies which constitute
an SPE for $\delta = 1$.



Players Remaining       Reservation Payoff    Proposal

{1,2,3}                 0 for i = 1, 2        ((1,0,0), N) for i = 1
{1,2,3}                 1 for i = 3           ((0,0), {2,3}) for i = 2, 3
{1,2}                   1/2                   ((1/2,1/2), {1,2})
{1,3}                   1/2                   ((1/2,1/2), {1,3})
{2,3}, {1}, {2}, {3}    0                     offer each remaining player zero



The first row of the table indicates that if all players remain (first column), 
then players 1 and 2 accept all nonnegative offers (column 2), and when chosen 
as proposer player 1 would propose that the grand coalition form and that he 
get the entire surplus (column 3). It is straightforward to check that these 
strategies are stationary and that they form an SPE, regardless of the protocol. 
In addition, it is clear that the outcome is (0,0,0) if the protocol is such that
either player 2 or 3 is chosen as the first proposer. On the other hand, if player
1 happens to be the first proposer, then the outcome is the unique core point, (1,0,0).
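
The claim that Example 3 has a unique core point can be verified by brute force. The following Python sketch is an illustration, not part of the original analysis: it encodes the characteristic function of Example 3, tests core membership directly from the definition, and searches a grid of efficient payoff vectors. The grid resolution and all function names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import combinations

PLAYERS = (1, 2, 3)


def v(coalition):
    """Characteristic function of Example 3."""
    s = frozenset(coalition)
    if s in (frozenset({1, 2}), frozenset({1, 3}), frozenset({1, 2, 3})):
        return Fraction(1)
    return Fraction(0)


def in_core(x):
    """x maps each player to a payoff; core = efficient and unblocked."""
    if sum(x.values()) != v(PLAYERS):
        return False
    for size in range(1, len(PLAYERS)):
        for coalition in combinations(PLAYERS, size):
            if sum(x[i] for i in coalition) < v(coalition):
                return False
    return True


def core_points_on_grid(steps=10):
    """Enumerate efficient payoff vectors on a grid and keep the core ones."""
    points = []
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            x = {1: Fraction(a, steps),
                 2: Fraction(b, steps),
                 3: Fraction(steps - a - b, steps)}
            if in_core(x):
                points.append((x[1], x[2], x[3]))
    return points
```

The grid search returns only (1,0,0). This agrees with the direct argument: adding the constraints for {1,2} and {1,3} to efficiency forces player 1's payoff to be at least 1.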

So, when the protocol affects the outcome, the outcome may not even be 
efficient let alone in the core. But when the protocol does not affect the outcome, 
it turns out that the outcome must be in the core as we now show. The following 
is based on Moldovanu and Winter (1991). 

Definition 2.3.1: Call a stationary SPE order independent if it remains an
SPE and induces the same outcome for all protocols.

Proposition 2.3.1 (Moldovanu and Winter (1991)): Suppose that $(v,N)$ is
totally balanced. Then $x$ is in the core of $(v,N)$ iff $x$ can be supported by an
order independent SPE of the coalitional bargaining game with $\delta = 1$.
Proof. Since $(v,N)$ is totally balanced we may choose, for each $S \subseteq N$, a
vector $x^S$ that is in the core of $(v_S, S)$, where $v_S$ is the restriction of $v$
to $S$. If $x$ is in the core, take $x^N = x$ and consider the following strategies.
If the set of remaining players is $S$:

(i) the proposer proposes $(x^S, S)$,

(ii) a proposal $(y,T)$ is accepted by $i \in T$ on his turn if and only if $y_i \geq x_i^S$.

It is straightforward to check that these strategies form a stationary, order
independent SPE.

Conversely, suppose that $x$ is supported by a stationary order independent
SPE, $\sigma$. Assume, by way of contradiction, that $x$ is not in the core. Hence
for some $y$ and $S$ with $y(S) \leq v(S)$, we have $y_i > x_i$ for all $i \in S$. Choose a
particular $i \in S$ and consider the protocol in which $i$ is the first proposer. By
order independence, $x$ remains a stationary SPE outcome under $\sigma$. Consider,
however, the deviation in which $i$ proposes $(y,S)$. If some $j \in S$ rejects the
proposal, then $j$ becomes the proposer. But since $\sigma$ is order independent, the
resulting continuation outcome will also be $x$. Consequently, every member of $S$
strictly prefers that $(y,S)$ be accepted than that some member of $S$ (including
himself) reject it. Hence subgame perfection demands that $(y,S)$ be accepted by
every member of $S$. But this deviation by $i$ is then profitable, a contradiction. ■

It is evident from the examples that the presence of discounting in the
coalitional bargaining model can preclude the existence of stationary SPE outcomes
that are in the core. Also evident is that the order in which players appear can
be critical. In Example 3, for instance, when player 1 is not the proposer, player
1 has no opportunity to respond to the proposal made by 2 or 3. In particular,
player 1 does not have an opportunity to suggest a proposal that all players
would prefer to (0,0,0). In short, player 1 is not given an opportunity to block
the eventual non-core outcome (0,0,0). But even our standard classroom story
behind the core relies on the ability of all players to block a non-core outcome.
Thus, it is no surprise that the coalitional bargaining model runs into trouble.

One might try to modify the coalitional bargaining model in the following 
way so that all players have an opportunity to block non core proposals: after 
a proposal is made, all remaining players must accept the proposal in order 
for it to go into effect, even those players who are outside the coalition whose 
payoffs are under consideration. Unfortunately, such a modification then allows 
all efficient outcomes to be supported, not just core outcomes. 

There is however a way out of the "player order" difficulty. It must first be
recognized that the problems we've encountered are artificial in the sense that
they are artifacts of the discrete-time game theoretic model we have chosen to
employ. If we were to imagine a real life setting in which people are gathered to
trade and negotiate and there are "no rules" governing the negotiating procedure,
then a priori there would be no prespecified ordering of the players. There
would be proposals and counterproposals with players jumping in and out of the
negotiations as they saw fit when they saw fit. This sort of free form bargaining
can be captured by employing a continuous-time (rather than discrete-time) model.
As we shall see, giving the players the opportunity to strategically time their
proposals adds just the kind of freedom needed to ensure core outcomes.

2.3.2 Continuous Time 

The following is based on Perry and Reny (1994a). We will not present here 
the formal details of the continuous-time model. A number of (well known) 
subtleties must be addressed in order to ensure that such a model is well defined. 
Perry and Reny (1994a) contains the details. Rather, we shall be content to 
outline the model and discuss the results. The similarity to the coalitional 
bargaining model of the previous section should be noted. The continuous-time 
model has the following features: 

1. Any player i can make a feasible proposal (x, S) at any time t > 0 (i need 
not be in S). 

2. There is at most one active proposal (the most recent one) at any time. 

3. An active proposal (x, S) can be accepted by members of S at any time
(strictly) after it is made. Once all members of S accept it (including the
proposer $i$ if $i \in S$), we say that the proposal has been accepted.

4. Once accepted, (x, S) becomes a binding proposal. Members of S needn't
leave and consume, although any member $i$ of S may at any subsequent
time choose to leave and receive payoff $x_i$ (there is no discounting), forcing
every member $j$ of S to leave at the same time and receive payoff $x_j$.

5. A proposal is rendered inactive once it is accepted, or once it is replaced 
by a new proposal made by some remaining player. 

6. If (x, S) is binding, then any proposal made to some member of S must 
be made to every member of S. 

A strategy is a mapping from histories of play into actions, where an action 
is one of the following: make a proposal, accept a proposal, be quiet, or leave.
The following restrictions are placed on strategies. 

(i) Once a player has accepted a proposal he must remain quiet until the proposal 
is rendered inactive. 

(ii) At any time t a player must be quiet a little before t and a little after t. 

Restriction (i) eliminates the possibility of unwinnable races arising. For 
instance if i makes an irrational proposal to j, and i accepts it, then (if he 
could) i would like to render the proposal inactive before j accepts it and j 
would like to accept it before it is rendered inactive. In continuous time such a 
race to act first, yet after, a given time cannot be won. Thus without (i), after 
such a history no SPE would exist and consequently no SPE exists at all. 
Restriction (ii) plays a dual role. It ensures that the continuous-time game
is well defined, and in this capacity its role is merely technical. In addition,
however, it serves a more significant purpose. The time during which all players
are quiet, although potentially arbitrarily small, is time enough for one of them
to make a proposal which blocks any non-core outcome.

Definition 2.3.2: A strategy is stationary if it depends only on the set of
remaining players; the active proposal; those who have accepted it; the set of
binding proposals; and the amount of time that the (currently) active proposal
has been active, or the amount of time that has elapsed without an active proposal
if currently there isn't one.

Proposition 2.3.2: If $x$ is a stationary SPE outcome, then $x$ is in the core of
$(v,N)$.



Proof. Suppose not. Then there is a feasible proposal $(y,S)$ such that
$y_i > x_i$ for all $i \in S$. Let some particular $i \in S$ propose $(y,S)$ at some time
$t$ such that according to their strategies all players are quiet at $t$ and no one
has accepted a proposal before time $t$. Such a time exists by restriction (ii)
above. It suffices to show that all members of $S$ must accept this proposal
before it is rendered inactive, since this would then be a profitable deviation by
$i$, contradicting the equilibrium hypothesis. To see that all must accept, suppose
that all but one member of $S$, player $k$ say, have accepted. If according to
the equilibrium strategies the proposal becomes inactive before $k$ accepts it, it
must be replaced by another proposal $(z,T)$. Suppose that the continuation
then leads to the payoff $w_k$ for player $k$. Since $k$ could have accepted $(y,S)$ but
did not in equilibrium, it must be the case that $w_k \geq y_k > x_k$. But player $k$
could then propose $(z,T)$ near enough to time zero to lead to a payoff for him
(by stationarity) of $w_k > x_k$. But this contradicts the hypothesis that $x$ is an
equilibrium outcome. Consequently, if all but one member of $S$ accepts $(y,S)$
before it is rendered inactive, so will the last member. Using this, the same
argument can be applied if all but two members of $S$ have accepted $(y,S)$. This
can be carried all the way back to conclude that every member of $S$ must accept
$(y,S)$ before it is rendered inactive. ■

Remark 2.3.1: Note how the line of proof mimics the usual classroom story 
motivating the core. 

Remark 2.3.2: The Proposition implies that total balancedness of $(v,N)$
is a necessary condition for the existence of a stationary SPE. For consider a
subgame in which the set of remaining players is $S$. The Proposition applies
equally well to this subgame considered as a game in its own right. Consequently,
a stationary SPE must induce a stationary SPE on this subgame and by the
Proposition the outcome on this subgame must be in the core of the TU game
restricted to $S$. Thus, for all $S \subseteq N$, the core of the TU game restricted to $S$
must be nonempty. Therefore $(v,N)$ must be totally balanced. The next result
shows that total balancedness is sufficient for the existence of a stationary SPE
as well.

Proposition 2.3.3: If $(v,N)$ is totally balanced, and $x$ is in its core, then $x$
can be supported as a stationary SPE.

A proof can be found in Perry and Reny (1994a). We give here only a sketch. 

Proof. (sketch) The main concern in constructing an equilibrium is to ensure 
that after any history there is an appropriate continuation with the following 
properties: 

1. Those with binding proposals must obtain payoffs that are no lower than 
that associated with their binding proposal. 

2. If a proposal (x, S) is active, then (in equilibrium) acceptance of it does
not reduce the payoff of those players not in S.

3. The subsequent outcome cannot be blocked (in the usual sense of blocking
coalitions) by "admissible" (to be explained below) coalitions.

Property (1) is obviously necessary otherwise those with binding proposals 
would deviate by leaving. Property (2) is needed to ensure that unwinnable races 
do not arise. If members of S wish to accept (x, S) while members not in S wish 
to preclude this, an unwinnable race will arise and existence of an equilibrium 
will be thwarted. This brings us to property (3). Note that the discussion in 
Remark (2.3.2) establishes that the outcome in any subgame cannot be blocked 
by any proposal that can be made in the continuation. When there are no 
binding proposals among the players that remain, this means that the outcome 
must be in the core of the TU game restricted to the remaining players. However, 
when there are binding proposals present, the set of allowable proposals in the 
continuation is restricted by (6) of the description of the game. Thus a coalition 
is admissible if whenever it contains a member who is a part of a binding proposal 
(x, S), it contains all members of S.

With the above three properties in mind, we provide the essential ingredient
for constructing strategies satisfying them. For each $T \subseteq N$, let $x^T$ denote
an element of the core of the TU game restricted to $T$. Consider any history
of play. Let $\Pi = \{(y^1,S^1), \ldots, (y^K,S^K)\}$ denote the set of binding proposals,
and let the set of remaining players be $S$. (Note that by (6) of the description
of the game, each player is a part of at most one binding proposal.) For each
player $i \in S$, consider the following function:

$$
z_i(\Pi,S) =
\begin{cases}
y_i^k + \dfrac{1}{|S^k|} \displaystyle\sum_{j \in S^k} \left( x_j^S - y_j^k \right), & \text{if } i \in S^k, \\[2ex]
x_i^S, & \text{if } i \in S \setminus \bigcup_k S^k.
\end{cases}
$$



An equilibrium continuation can now be constructed based on the vector-valued
function $z(\Pi,S)$. The idea behind the construction is that $z(\Pi,S)$ is
the equilibrium outcome after any history of play in which no remaining player
(among $S$) has accepted the active proposal or in which there is no active
proposal. We are content here to check that this ensures properties (1)-(3).

Property (1): It suffices to show that if $i \in S^k$ for some $k$ then
$z_i(\Pi,S) - y_i^k \geq 0$. Now since $(y^k,S^k)$ is a feasible proposal,
$y^k(S^k) \leq v(S^k)$. In addition, because $x^S$ is in the core of $(v,S)$ and
$S^k \subseteq S$, $x^S(S^k) \geq v(S^k)$. These two inequalities yield the result.

Property (2): If there is no active proposal, then $i$'s payoff will be $z_i(\Pi,S)$.
If there is an active proposal $(y,T)$ and it is accepted, it becomes binding and
is added to the list of binding proposals. Thus $\Pi$ becomes $\Pi'$, say. However, for
all $i \notin T$, it is easy to check using the definition that $z_i(\Pi',S) = z_i(\Pi,S)$.

Property (3): It suffices to show that $z(\Pi,S)$ cannot be blocked by any
coalition $T$ such that for each $k$ either $T$ contains $S^k$ or $T$ is disjoint from $S^k$.
So, consider such a $T$ and suppose that $(w,T)$ is a feasible proposal. Then
$\sum_{i \in T} z_i(\Pi,S) = \sum_{i \in T} x_i^S \geq v(T) \geq w(T)$, where the equality
follows by definition of $z$, the first inequality follows since $x^S$ is in the core
of $(v,S)$ and $T \subseteq S$, and the second inequality follows from the feasibility of
$(w,T)$. Consequently $T$ cannot block $z(\Pi,S)$. ■
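
The continuation payoffs used in the sketch of the proof can be computed mechanically. The Python sketch below is an illustration under stated assumptions: it takes the continuation payoff of a member of a binding coalition to be his binding payoff plus an equal share of the amount by which the chosen core vector exceeds the binding payoffs on that coalition, and it takes every other remaining player's payoff from the core vector. The data layout (dictionaries keyed by player) and the function name are hypothetical.

```python
from fractions import Fraction


def z(binding, remaining, x_S):
    """Continuation payoffs for the remaining players.

    binding   -- list of (y, S_k) pairs; y maps each member of S_k to
                 his payoff under that binding proposal
    remaining -- set S of remaining players
    x_S       -- dict giving a core vector of the game restricted to S
    """
    payoffs = {}
    covered = set()
    for y, S_k in binding:
        # Excess of the core vector over the binding payoffs on S_k,
        # shared equally among the members of S_k.
        surplus = sum(x_S[j] - y[j] for j in S_k)
        for i in S_k:
            payoffs[i] = y[i] + Fraction(surplus) / len(S_k)
        covered |= set(S_k)
    for i in remaining - covered:
        payoffs[i] = x_S[i]
    return payoffs
```

For instance, in a three-player game with v(N) = 3, v of any pair equal to 1, and core vector (1,1,1), a binding proposal giving 3/5 to player 1 and 2/5 to player 2 leads to continuation payoffs 11/10, 9/10 and 1. Each bound player does at least as well as under his binding proposal, and the total received by {1,2} equals what the core vector assigns to that coalition, in line with properties (1) and (3).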



Remark 2.3.3: With an additional restriction on strategies, Propositions 2.3.2
and 2.3.3 can be extended to the NTU case. The additional restriction is that
only those to whom a proposal is directed can speak. All others must remain
quiet until the proposal becomes inactive.

Remark 2.3.4: Although adding discounting to the discrete-time model gives 
the core no chance, discounting can be added to the continuous-time model 
without affecting the results in any significant manner. The only adjustment 
that must be made is that one must look instead at $\varepsilon$-perfect equilibria rather
than exact equilibria. Perry and Reny (1994a) contains the details. 



So what have we gained from all of this? The hope is to have obtained 
some insights regarding those circumstances in which core outcomes are more 
likely to arise. The analysis of this section suggests that the players must employ 
relatively simple strategies that have a stationary structure, and that each player 
must have the opportunity to intervene (by making a proposal) before any player 
or group of players decides to leave. To the extent that these properties are 
present in settings in which the players can interact in an "unfettered" manner,
we have then gone some way toward shedding light on the answer to Edgeworth's
question: "Are core outcomes inevitable?".
Abstract. This paper surveys implementation theory when players have
incomplete or asymmetric information, especially in economic environments.
After the basic problem is introduced, the theory of implementation is sum- 
marized. Some coalitional considerations for implementation problems are 
discussed. For economies with asymmetric information, cooperative games 
based on incentive compatibility constraints or Bayesian incentive compati- 
ble mechanisms are derived and examined. 
1 Introduction

In this paper, we examine part of the literature regarding implementation un- 
der incomplete or asymmetric information. Implementation includes not only 
the classical social choice problem of characterizing the functions or sets of al- 
locations or outcomes that can be obtained as the result of a group decision — 
in this case, with truthful Bayes-Nash equilibrium in a direct mechanism — but 
also the extension of these group decisions to encompass cooperative game- 
theoretic solution concepts (rather than exclusively noncooperative equilib- 
ria) and the inclusion of incentive compatibility or mechanism considerations 
into cooperative games derived from economies with asymmetric or incom- 
plete information. 

After introducing the basic incentive compatibility problem, we proceed 
to examine implementation theory proper when there is incomplete informa- 
tion, with particular emphasis on Jackson’s (1991) necessary and sufficient 
conditions for Bayes-Nash implementation. After some remarks concerning
the possibilities for considering coalitional behavior in the implementation 
problem with incomplete information, we redirect our attention to economies 
with incentive compatibility constraints and the games they generate. Finally, 
we briefly consider a game-theoretic model of how agents could cooperatively 
select a Bayesian incentive compatible mechanism. 
2 The Basic Problem



Consider a simple prototype implementation problem with asymmetric information.
Suppose that the uncertainty is summarized by a set $S = \{s_1, s_2, s_3\}$
of states of the world or signals correlated with states. Let A be a set of 
possible actions. The problem is to pick a mapping from states to actions 
optimally. 

Suppose further that there are two individuals, one of whom (distinguished
by the subscript $I$ for “informed”) knows the state $s \in S$ while
the other (denoted by the subscript $U$ for “uninformed”) does not know anything
about which state $s \in S$ has occurred, although the set $S$, an objective
probability on it, the utility functions, and the information structure are all
common knowledge. The preferences of these two individuals are given by
state-dependent utilities $u_I : A \times S \to \mathbb{R}$ and $u_U : A \times S \to \mathbb{R}$.

Incentive compatibility here means that $a : S \to A$ satisfies
$u_I(a(s); s) \geq u_I(a(s'); s)$ for all $s, s' \in S$. This is also sometimes termed
the self-selection constraint. It means that the informed agent is willing to
reveal truthfully the state of the world, because in all states $s \in S$, the utility
of $a(s)$ given $s$ is never less than the utility the informed agent could obtain by
stating $s' \in S$ and thus receiving $a(s')$ when the true state is $s \in S$. There is
no incentive compatibility constraint for the second agent because his lack of
information is common knowledge.

For the special case in which $A \subseteq \mathbb{R}$ and $u_I$ is strictly monotone on $A$
for each $s \in S$, incentive compatibility requires $a(s_1) = a(s_2) = a(s_3)$. The
informed agent cannot be forced to reveal information truthfully if doing so
would lead to this agent receiving less “money” than he or she could obtain
in some other state of the world.
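
The self-selection constraint and its consequence in the monotone case can be illustrated by enumeration. The Python sketch below is a hypothetical example, not taken from the paper: the three actions, the state labels and the utility function money_utility are assumptions chosen so that utility is strictly increasing in the action in every state.

```python
from itertools import product


def is_incentive_compatible(a, u_I, states):
    """True if u_I(a(s), s) >= u_I(a(t), s) for all states s and t."""
    return all(u_I(a[s], s) >= u_I(a[t], s)
               for s in states for t in states)


def incentive_compatible_mappings(actions, u_I, states):
    """Enumerate all mappings from states to actions; keep the IC ones."""
    result = []
    for choice in product(actions, repeat=len(states)):
        a = dict(zip(states, choice))
        if is_incentive_compatible(a, u_I, states):
            result.append(a)
    return result


# Hypothetical data: three states and three monetary actions.
STATES = ("s1", "s2", "s3")
ACTIONS = (0, 1, 2)


def money_utility(action, state):
    """Assumed utility: strictly increasing in the action in every state."""
    scale = {"s1": 1, "s2": 2, "s3": 3}[state]
    return scale * action
```

Of the 27 possible mappings, only the three constant ones survive the check, which is the enumerated counterpart of the requirement that the same action be chosen in every state.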

Changing the model to give the uninformed agent partial information so
that he can distinguish $\{s_1\}$ from the event $\{s_2, s_3\}$ can alter the results.
However, whether this partial information is verifiable (whether it can be
confirmed by some third party who can act as a referee if necessary) matters
greatly. If the information is verifiable, the partially uninformed agent can
force an allocation in state $s_1$ which may be either better or worse for the fully
informed agent than what the fully informed agent received in $s_2$ or $s_3$ [which
still must satisfy $a(s_2) = a(s_3)$ in the strictly monotone one-dimensional
example], and similarly for the informed agent. If $\{s_1\}$ versus $\{s_2, s_3\}$ is
not verifiable, then we must add incentive compatibility constraints for the
partially uninformed agent in order to force him to reveal correctly whether
he believes that the true state lies in $\{s_1\}$ or $\{s_2, s_3\}$.
For convenience, one sometimes imposes excess incentive compatibility
when there is asymmetric information. Doing so may decrease welfare, but
it sometimes doesn't change the qualitative properties of the solution. Obviously,
the advantage is to simplify notation by, for instance, treating informed
and uninformed agents symmetrically by giving them all the same incentive
compatibility constraint that properly applies only to the informed agents
(when their identities are common knowledge). This procedure may be correct
if the less informed agents cannot be distinguished and if all agents are
permitted to announce the state of the world; in this case, the uninformed agents
could always, for instance, announce the state $s$ if $u_U(a(s); s) \geq u_U(a(s'); s)$
for all $s, s' \in S$.

In the terminology of noncooperative game theory, incentive compatibility 
says that telling the truth is a Nash equilibrium in the game with strategies 
consisting of announcements about states of the world and payoffs defined by 
utilities evaluated at the proposed $a : S \to A$ mapping for the true state of the
world. Implementation basically means that an allocation can be obtained as
a truth-telling Nash equilibrium; this idea will be made more precise later.

An introduction to this literature can be found in d'Aspremont and
Gérard-Varet (1979, 1982), Myerson (1991), and Postlewaite and Schmeidler
(1987). Note, however, that I shall not attempt to give a complete reference 
list or even a historical summary of this topic. 



3 Implementation with Asymmetric Information 

A fundamental result in implementation theory is the revelation principle, 
which roughly states that anything which is incentive compatible (and hence
implementable) can be implemented as a truth-telling (Nash) equilibrium of
a direct mechanism, where a direct mechanism is a noncooperative game in 
which players’ strategies consist of complete announcements of what they 
know about their “type” (i.e., preferences). The extension to incomplete in- 
formation frameworks is due to Rosenthal (1978), Myerson (1979), and Harris 
and Townsend (1981); for a discussion, see the textbook by Myerson (1991, 
pp. 260-261) or the survey paper by Postlewaite and Schmeidler (1987). 

Bayes-Nash Revelation Principle With incomplete information, if an 
allocation function can be obtained as a Bayes-Nash equilibrium (of some 
mechanism or some communication game), then it can be implemented with 
truthful equilibrium strategies in a direct mechanism. 
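
The construction behind the revelation principle is a composition: the direct mechanism applies the original mechanism to the messages that the equilibrium strategies would have sent. The Python sketch below illustrates this on a deliberately simple, hypothetical two-player game with finitely many types; the mechanism g, the strategy sigma, the prior and the utility function are all assumptions made for the illustration.

```python
from itertools import product


def is_bayes_nash(mechanism, strategies, type_sets, message_sets, prior, utility):
    """Check that no type of any player gains, in expectation, by deviating."""
    n = len(type_sets)
    for i in range(n):
        for t_i in type_sets[i]:
            def expected(message):
                # Expected utility of player i of type t_i sending `message`
                # while the others follow their strategies (unnormalized).
                total = 0.0
                for types in product(*type_sets):
                    if types[i] != t_i:
                        continue
                    msgs = [strategies[j](types[j]) for j in range(n)]
                    msgs[i] = message
                    total += prior(types) * utility(i, mechanism(tuple(msgs)), types)
                return total
            equilibrium_value = expected(strategies[i](t_i))
            if any(expected(m) > equilibrium_value + 1e-12 for m in message_sets[i]):
                return False
    return True


def direct_mechanism(mechanism, strategies):
    """Compose the mechanism with the strategies: h(t) = g(sigma(t))."""
    def h(reported_types):
        return mechanism(tuple(s(t) for s, t in zip(strategies, reported_types)))
    return h


# Hypothetical example: two players, types 0 or 1, messages "L" or "H".
TYPES = [(0, 1), (0, 1)]
MESSAGES = [("L", "H"), ("L", "H")]


def prior(types):
    return 0.25  # independent, uniformly distributed types


def g(messages):
    return messages  # the outcome simply records the message profile


def utility(i, outcome, types):
    wants_high = types[i] == 1
    return 1.0 if (outcome[i] == "H") == wants_high else 0.0


def sigma(t):
    return "H" if t == 1 else "L"


def truthful(t):
    return t


h = direct_mechanism(g, [sigma, sigma])
```

In this example sigma is an equilibrium of g, and truthful reporting is an equilibrium of the composed mechanism h, which yields the same outcome in every type profile.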

A key insight is the importance of an informational condition,
known as publicly predictable information (PPI) or nonexclusivity of information
(NEI). The assumption states that no player has information which is not
at least as coarse as the pooled information of all other players. In symbols, if
we let the sub-σ-field $\mathcal{G}_i$ denote the information of player $i \in N$, where all of
the $\mathcal{G}_i$ are sub-σ-fields of a given σ-field $\mathcal{F}$ of measurable events, publicly
predictable information precisely requires that for all $i \in N$,
$\mathcal{G}_i \subseteq \sigma\left(\bigcup_{j \neq i} \mathcal{G}_j\right)$.

This means that all of the other players, acting together, can always detect 
lies by any individual. The significance of publicly predictable information 
is that it permits the use of “forcing contracts” or mechanisms in which 
an extremely bad outcome arises whenever a single player tells a lie. If the 
messages sent by players are inconsistent, the mechanism assigns the worst 
possible outcome so that any unilateral lie looks extremely risky; hence, truth 
must be a Nash equilibrium. The condition was discovered by — in alphabet- 
ical order — Blume and Easley (1990), Palfrey and Srivastava (1987), and 
Postlewaite and Schmeidler (1986); see also the discussion by Postlewaite 
and Schmeidler (1987). 
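
When the set of states is finite, each player's information can be represented by a partition of the states, and nonexclusivity can be checked directly. The Python sketch below makes that simplifying assumption (partitions in place of sub-σ-fields); the function names are hypothetical.

```python
def join(partitions, states):
    """Coarsest common refinement of the given partitions of `states`."""
    cells = {}
    for s in states:
        # Two states share a cell iff every partition puts them together.
        signature = tuple(
            next(idx for idx, cell in enumerate(p) if s in cell)
            for p in partitions
        )
        cells.setdefault(signature, set()).add(s)
    return [frozenset(c) for c in cells.values()]


def is_coarser(partition, refinement):
    """True if every cell of `refinement` lies inside a cell of `partition`."""
    return all(any(cell <= big for big in partition) for cell in refinement)


def has_nonexclusive_information(partitions, states):
    """Each player's partition is coarser than the others' pooled information."""
    for i, own in enumerate(partitions):
        others = partitions[:i] + partitions[i + 1:]
        if not is_coarser(own, join(others, states)):
            return False
    return True
```

With four states, two players who both observe the partition {1,2}, {3,4} and a third who observes nothing satisfy the condition. If one player alone could additionally tell state 1 from state 2, the condition would fail, because the others could not detect a lie about which of the two occurred.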

An important research topic was the elucidation of necessary conditions 
and sufficient conditions for Bayes-Nash implementation. This work has re- 
sulted in a huge literature, including the articles by Blume and Easley (1990), 
Palfrey and Srivastava (1987), and Postlewaite and Schmeidler (1986) men- 
tioned above. Contributions by Palfrey and Srivastava (1989) and Jackson 
(1991) are especially relevant here; Jackson’s (1991) result for economic envi- 
ronments will be discussed in detail in the following section because he does 
obtain a set of conditions which are both necessary and sufficient. Further 
literature includes articles by Matsushima (1988, 1991) and Palfrey and Sri- 
vastava (1986, 1991). See also the recent survey by Palfrey and Srivastava 
(1992) and the background material on games with communication due to 
Forges (1986) and Myerson (1986). 

Palfrey (1992) focuses on the problem of multiple equilibria for Bayes- 
Nash implementation. Mechanisms — even those direct mechanisms for which 
truth telling is a Nash equilibrium — typically exhibit many Nash equilibria. 
Therefore, the value of the revelation principle may be limited in the sense 
that some allocation could well be implementable as the unique equilibrium 
of some mechanism while being only one of a plethora of equilibria of direct 
mechanisms for which the given allocation arises as the truthful equilibrium. 
The notion of full implementation addresses this issue, as full implementabil- 
ity of an allocation means that it is implementable as the unique Bayes-Nash 
equilibrium of some suitable mechanism. 

Ledyard (1986) expounds a critique of the concept of implementation. 
Using the mild hypotheses of strictly positive prior probabilities and mono- 
tonically increasing transformations of utilities, he points out that any un- 
dominated outcome can be rationalized as a Bayes-Nash equilibrium of some 
game. Of course, this means that Bayes-Nash implementation doesn’t lead 
to interesting restrictions unless one either tightens the requirements of the 
definition of implementation (for instance, by requiring full implementation), 
restricts the class of allowable games, or insists on some refinement of Bayes- 
Nash equilibrium. 
4 Jackson’s Article

Jackson (1991) provides a set of necessary and sufficient conditions for Bayes- 
Nash implementation with or without the hypothesis of publicly predictable 
information. Previous work using the PPI assumption found two necessary 
conditions for Bayes-Nash implementability: incentive compatibility (also
called self-selection) and a Bayesian analogue of Maskin’s (1977) monotonic- 
ity condition. [See Postlewaite and Schmeidler (1986), Palfrey and Srivastava 
(1987), and Blume and Easley (1990).] In attempting to find a converse result, 
Jackson (1991) adds a closure condition [which was somewhat implicit in the 
Palfrey and Srivastava (1987) and Postlewaite and Schmeidler (1986) assump- 
tions] that one can always “patch together” allocation functions at common 
points in players’ information partitions. [The operation is reminiscent of Sav- 
age’s (1954) treatment of personal probability and expected utility.] Subject 
to technicalities, incentive compatibility, Bayesian monotonicity, and closure 
are together necessary and sufficient for Bayes-Nash implementation. 

For Jackson’s (1991) theorem, we consider an exchange economy with at 
least three traders and strictly monotone utilities for every trader in every 
state of the world. The formulation could allow for public goods and
externalities. To fix notation, let $N = \{1, \ldots, n\}$ be the set of economic agents and
let $\succeq_i$ denote trader $i$'s preference relation. I shall define and explain
terminology after stating the result. To simplify, I restrict attention to economic
environments; see Jackson (1991) for extensions to more general situations. 

Theorem A social choice set $F$ is implementable if and only if there exists a
social choice set $\hat{F}$ which is equivalent to $F$ such that $\hat{F}$ satisfies (IC), (BM),
and (C).

Definition 1 A social choice set $F$ is a subset of the set of all social
choice functions. In symbols, if $S = S_1 \times \cdots \times S_n$, where for $i \in N$, $S_i$
is the finite information set of player $i$, and if $A$ denotes the set of all
feasible acts, which are assumed to be independent of elements in the set
$S$ (i.e., let $A$ be the set of state-dependent allocations that are
resource-feasible, given traders' initial endowments $e_i \in \mathbb{R}^\ell_+$ for $i \in N$, so that
$A = \{(x_i : S \to \mathbb{R}^\ell_+)_{i \in N} \mid \sum_{i \in N} x_i(s) \leq \sum_{i \in N} e_i \text{ for all } s \in S\}$),
then $F$ is a subset of $X = \{x \mid x : S \to A\}$.

We say that two social choice sets are equivalent if they are equal almost 
surely. Consequently, we need only work with those social choice sets that 
are defined on some convenient subset of $S$ of full measure. If every
$s = (s_1, \ldots, s_n) \in S$ occurs with strictly positive probability, no two distinct
social choice sets can be equivalent; in this case, the theorem reduces to the
statement that $F$ is implementable if and only if it satisfies (IC), (BM), and
(C).
Definition 2 A social choice set $F$ satisfies condition (IC) if for all $i \in N$,
all $x \in F$, all $s \in S$, and all $t_i \in S_i$, $x(s) \succeq_i(s_i)\, x(s_{-i}, t_i)$, where
$\succeq_i(s_i)$ denotes trader $i$'s preference relation when his information set is
$s_i \in S_i$.

Definition 3 A social choice set $F$ satisfies condition (C) if for all common
knowledge partitions $\{S', S''\}$ of $S$ and all $x, y \in F$, there is $z \in F$ such that
$z(s) = x(s)$ if $s \in S'$ and $z(s) = y(s)$ if $s \in S''$.

The closure condition is needed because equilibria of mechanisms can
similarly be patched together based on common knowledge events. If a mechanism
has two Bayes-Nash equilibria (call them $x$ and $y$), then it must also
have a third equilibrium, $z$, defined by doing $x$ on part of $S$ and $y$ on the
other part of $S$, provided that $S$ can be divided into two or more pieces that
are common knowledge.
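
The patching operation in the closure condition is easy to state in code. The Python sketch below is illustrative: social choice functions are represented as dictionaries from states to outcomes, and the list of common knowledge partitions is supplied by hand rather than derived from the players' information, which is an assumption of the example.

```python
def patch(x, y, event):
    """Social choice function that follows x on `event` and y elsewhere."""
    return {s: (x[s] if s in event else y[s]) for s in x}


def satisfies_closure(F, partitions):
    """Check closure of F under patching for the given two-cell partitions.

    F          -- list of social choice functions (dicts state -> outcome)
    partitions -- list of pairs (S1, S2), each assumed common knowledge
    """
    for first, _second in partitions:
        for x in F:
            for y in F:
                if patch(x, y, first) not in F:
                    return False
    return True
```

With two states "a" and "b", each forming a common knowledge event, the set containing only the constant functions 0 and 1 is not closed, whereas adding the two mixed functions obtained by patching makes it closed.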

Definition 4 $F$ satisfies condition (BM) if, whenever $x \in F$ and
$\alpha = (\alpha_1, \ldots, \alpha_n)$ is a deception, where $\alpha_i : S_i \to S_i$ for all $i \in N$ and
$x \circ \alpha$ denotes the social choice function with outcomes
$x(\alpha(s)) = x(\alpha_1(s_1), \ldots, \alpha_n(s_n))$ for all $s = (s_1, \ldots, s_n) \in S$, and
whenever there is no social choice function in $F$ which is equivalent to
$x \circ \alpha$, then there exist $i \in N$, $s_i \in S_i$ and $y \in X$ such that
$(y \circ \alpha) \succ_i(s_i) (x \circ \alpha)$ and $x \succeq_i(t_i)\, y \circ \alpha_i(s_i)$ for all $t_i \in S_i$, where
$(y \circ (\alpha_i(s_i)))(s) = y(s_{-i}, \alpha_i(s_i))$ for all $s \in S$.

An interpretation of the Bayesian monotonicity condition is as follows
(ignoring equivalence): If a mechanism implements $F$ and if $x \in F$, then
there is an equilibrium $\sigma$ (of the game defined by the mechanism) which
yields $x$. If agents use deception $\alpha$, they obtain $x \circ \alpha$. If there does not exist
a social choice function in $F$ which is equivalent to $x \circ \alpha$, then $\sigma \circ \alpha$ cannot
be an equilibrium. Bayesian monotonicity ensures that, in fact, $\sigma \circ \alpha$ isn't an
equilibrium. The idea is that agent $i$ uses $y$ to signal that $\alpha$ is being played;
this makes trader $i$ happier. The second condition says that player $i$ cannot
gain by falsely accusing others of deception.



5 Cooperative Implementation 



By definition, implementation is a noncooperative concept; it requires alloca- 
tions to arise as (truthful) Nash equilibria. Perhaps the most straightforward 
way to include the consideration of coalitional behavior is to replace Nash 
equilibrium with strong equilibrium. A disadvantage of this approach is that 
strong equilibria may not exist in general noncooperative games, whereas 
there are always Nash equilibria, at least in mixed strategies under fairly 
general technical conditions. A more radical strategy is to examine the possi- 
bilities for attaining outcomes as some cooperative solution in a game. In this 
case, the precise application of incentive compatibility constraints is unclear. 
Should one worry about incentives to lie within a coalition that is cheating? 
Are blocking allocations required to be incentive compatible? Such considerations
seem to have the flavor of bargaining sets (i.e., objections versus
counterobjections) or coalition proofness. 

Incentive compatibility can be incorporated into cooperative games on
three levels. First, one can find some solution set and ask whether it satisfies
incentive compatibility or, as a weaker alternative, at least contains some
outcome satisfying incentive compatibility. This is the approach taken by Krasa
and Yannelis (1994) and Koutsougeras and Yannelis (1992). Secondly, one
can require incentive compatibility only in the definition of feasible actions
for the grand coalition. This approach implicitly appears in the second-best
efficiency considerations for the planner in the literature on incentives and
mechanism design. Finally, one can consistently require incentive compatibility
for the definition of feasible agreements for all coalitions. This strongest
use of incentive compatibility treats all coalitions symmetrically but possesses
the disadvantage of possibly leading to games that violate some of the
standard properties one expects. This tack is followed in Allen (1991, 1992, 1993,
1994).

A further factor which complicates the analysis is that games without 
transferable utility are more appropriate when incentive considerations are 
present. To summarize the worth of a coalition by a single number — as is 
done in the definition of cooperative games with transferable utility (or TU
games) — suggests that members of the coalition share a single objective func- 
tion. Yet, if these players were indeed a team, they would necessarily be will- 
ing to share their information fully and honestly in order to better maximize 
the total payoff accruing to the coalition. This contradicts the spirit of in- 
centive compatibility, which hypothesizes that players will hide information 
or will lie to further their own goals. 

Finally, one can ask whether participation or individual rationality con- 
straints should be imposed. Requiring that all players be willing to play the 
game is natural for some mechanism problems, as it is a weaker rationality
requirement than Bayesian incentive compatibility. On the other hand, in a
cooperative context, most solution concepts are automatically (by definition)
individually rational, although out-of-equilibrium behaviors such as blocking 
and objecting may not always be individually rational compared to nonpar- 
ticipation. Moreover, ex ante and ex post individual rationality are distinct 
concepts. The latter restricts risk sharing so that its imposition can prevent 
efficient outcomes such as those obtainable with fair insurance contracts. 

6 Economies with Asymmetric Information

Consider a pure exchange economy with agent set $N = \{1, \ldots, n\}$ in which
the state space $\Omega$ is a finite set. To simplify, assume that every state of the world occurs
with strictly positive probability and that these probabilities are common
knowledge. Agents' information either consists of partitions on $\Omega$ or can be
specified by signals $s_i : \Omega \to S_i$, where each $S_i$ is also assumed to be a
finite set. Write $S = \prod_{i \in N} S_i$ and $s = (s_1, \ldots, s_n)$. Consumption sets are
$\mathbb{R}^\ell_+$ and initial endowments are $e_i \in \mathbb{R}^\ell_+$ for $i \in N$. Endowments are
assumed not to depend on $\Omega$ or $S$ in order to guarantee that initial
endowment vectors are incentive compatible and hence that there exist incentive
compatible feasible allocations. Preferences are specified by state-dependent
cardinal utilities $u_i : \mathbb{R}^\ell_+ \times \Omega \to \mathbb{R}$ where, for every $i \in N$ and every $\omega \in \Omega$,
$u_i(\cdot\,; \omega) : \mathbb{R}^\ell_+ \to \mathbb{R}$ is continuous, strictly monotone, and strictly concave.

The classical incentive compatibility constraints are given by the
restrictions that allocations $x_i : \Omega \to \mathbb{R}^\ell_+$ must satisfy, for all $i \in N$ and all $\omega \in \Omega$,
$u_i(x_i(\omega); \omega) \geq u_i(x_i(\omega'); \omega)$ for all $\omega' \in \Omega$. Note that these constraints
apply to every player regardless of the coalition to which he belongs. They are
written in “overkill” fashion, as if each player were able to distinguish all
states rather than in a form that reflects the player’s individual information
(which could depend on his coalition). Think of these incentive compatibility
constraints as restrictions on the state-dependent consumption set of each
agent.
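
For a finite state space the classical constraints can be checked by direct enumeration. The sketch below is a minimal illustration, not part of the original analysis: the function name `is_classically_ic`, the one-good economy, and the square-root utility are assumptions chosen only to make the example concrete.

```python
from itertools import product


def is_classically_ic(states, allocation, utility):
    """Return True if no agent prefers, in any state, the bundle assigned
    to him in some other state.

    allocation[i][w] is agent i's bundle in state w;
    utility(i, bundle, w) is agent i's state-dependent utility.
    """
    for i in allocation:
        for w, w_alt in product(states, repeat=2):
            truthful = utility(i, allocation[i][w], w)
            misreport = utility(i, allocation[i][w_alt], w)
            if misreport > truthful:
                return False
    return True


# Hypothetical two-state, one-good example (all numbers are illustrative).
states = ["low", "high"]


def utility(i, bundle, w):
    # Assumed utility; it happens not to depend on the state here.
    return bundle ** 0.5


constant = {1: {"low": 1.0, "high": 1.0}, 2: {"low": 1.0, "high": 1.0}}
contingent = {1: {"low": 0.5, "high": 1.5}, 2: {"low": 1.5, "high": 0.5}}

print(is_classically_ic(states, constant, utility))    # True
print(is_classically_ic(states, contingent, utility))  # False
```

The state-independent allocation passes trivially, which is the reason given above for assuming that endowments do not depend on the state; the state-contingent allocation fails because each agent would prefer the bundle assigned to him in the other state.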

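Alternatively, for a framework in which traders receive signals about the
state of the world, Bayesian incentive compatibility requires

$$\sum_{\omega \in \Omega} u_i(x_i(s(\omega)); \omega) \, \mu_i(\omega \mid s_i) \geq \sum_{\omega \in \Omega} u_i(x_i(s_{-i}(\omega), s_i'); \omega) \, \mu_i(\omega \mid s_i)$$

for all $s_i \in S_i$, all $s_i' \in S_i$, and all $i \in N$, where the allocation $x_i : \Omega \to \mathbb{R}^\ell_+$
must be measurable with respect to the signals $s(\cdot) = (s_1(\cdot), \ldots, s_n(\cdot))$ (so that it
can be written as a function of the signal profile), and
$\mu_i(\omega \mid s_i)$ denotes player $i$’s posterior probability of $\omega \in \Omega$, given that he or
she has observed signal $s_i \in S_i$.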



7 Incentives with Asymmetric Information 

The study of cooperative solution concepts for economies with incentive con- 
siderations has focused primarily on the core, although the value has also 
been examined. One approach that has proved useful is to analyze the coop- 
erative games with nontransferable utility that are generated by (exchange) 
economies with incentive compatibility constraints. Thus, one defines the
cooperative games $V : 2^N \to 2^{\mathbb{R}^N}$ with nontransferable utility (or NTU games)
by $V(\emptyset) = \mathbb{R}^N$ and, for $T \subseteq N$ with $T \neq \emptyset$, $V(T) = \{(w_1, \ldots, w_n) \in \mathbb{R}^N \mid$
for $i \in T$, there exists $x_i : \Omega \to \mathbb{R}^\ell_+$ such that, for all fully informed $i \in T$
and all $\omega, \omega' \in \Omega$, $u_i(x_i(\omega); \omega) \geq u_i(x_i(\omega'); \omega)$, where $\sum_{i \in T} x_i(\omega) = \sum_{i \in T} e_i$
for all $\omega \in \Omega$ and $w_i \leq \sum_{\omega \in \Omega} u_i(x_i(\omega); \omega) \, \mu(\omega)$ for all $i \in T\}$, where $\mu(\omega)$
is the probability of state $\omega$ and agents in $N$ are assumed to be either fully
informed (i.e., their information partitions on $\Omega$ precisely equal $2^\Omega$) or
completely uninformed (i.e., their information partitions on $\Omega$ are the trivial
partitions $\{\Omega\}$). One can modify the game to take careful account of players’
partial information or to use the Bayesian version of incentive compatibility
constraints.

The incentive compatible core was first introduced by Boyd and Prescott 
(1986) in a model of financial intermediation with risk neutrality. They 
demonstrate nonemptiness of the core by showing that certain systems of 
linear inequalities can be solved. Berliant (1992) and Marimon (1989) also 
examine the incentive compatible core for particular economic problems — 
those involving taxation and adverse selection. Allen (1991, 1994) follows the 
approach outlined above of deriving NTU games from economies with (clas- 
sical or Bayesian) incentive compatibility constraints and finds that the game 
need not be balanced and can, in fact, have an empty core. For economies 
with asymmetric information, Koutsougeras and Yannelis (1992) define core
allocations and check whether they are incentive compatible. 

For the value, Allen (1992) derives the games from economies with (clas- 
sical or Bayesian) incentive compatibility and shows that the value is well 
defined. Krasa and Yannelis (1994) focus on the private information value 
and ask whether the fine, coarse, and private information values satisfy in- 
centive compatibility. 



8 Mechanisms with Asymmetric Information 

Instead of adding incentive compatibility constraints to the definition of the 
games derived from economies with asymmetric information, one can incor- 
porate Bayesian incentive compatible mechanisms into the definition of these 
games. This approach builds on the work of Harsanyi (1967-68) on noncoop- 
erative games with incomplete information and its use by Myerson (1984) to 
model cooperative games with incomplete information. 

Allen (1993) proposes a game containing both cooperative and noncoop- 
erative phases in which the feasible outcomes are taken to be Bayesian-Nash 
equilibrium outcomes of Bayesian incentive compatible direct mechanisms. 
Formally, the entire model is assumed to be common knowledge and, in the 
first strategic phase, players cooperatively pick a Bayesian incentive com- 
patible mechanism. The choice of a mechanism is a binding agreement; the 
commitment is made ex ante. Then, after agents learn their types, the non- 
cooperative game defined by the chosen mechanism is played. Traders send 
messages (about their types, since we can restrict ourselves to direct
mechanisms by the Bayes-Nash revelation principle), which lead to an outcome
according to the mechanism. The equilibrium concept used in the noncooperative
(mechanism) game phase is Bayesian-Nash equilibrium, which (by the
revelation principle) can be taken to be truthful. Somewhat more formally,
the game is given by $V(S) = \{(w_1, \ldots, w_n) \in \mathbb{R}^N \mid$ there exists a randomized
direct Bayesian incentive compatible mechanism $A$, and there is a (truthful)
Bayesian-Nash noncooperative equilibrium $\sigma$ for $A$ such that, if $i \in S$, $i$’s payoff
in $A$ under $\sigma$ is at least as great as $w_i\}$. The use of incentive compatible
mechanisms in cooperative economic contexts is also studied by Ichiishi and
Idzik (1992), Page (1992), and Rosenmüller (1990).

1 Introduction

Cooperative and non-cooperative approaches to game theory represent two po- 
lar, and simplifying, extremes. In the former, it is assumed that players can 
make commitments that are binding, i.e., an agreement once made is enforce- 
able. In contrast, non-cooperative game theory assumes that agreements cannot 
be enforced and equilibrium agreements or strategies, therefore, must be self- 
enforcing. 

The present paper will explore two aspects of the inter-connections between 
these two polar extremes. First, we discuss the issue of consistency in coalitional 
deviations. This is, at least in spirit, a non-cooperative idea but one that can 
be applied even to cooperative equilibrium concepts in the characteristic func- 
tion form. Second, we focus on the growing literature that studies the process 
through which equilibrium cooperative agreements are reached. Even in a set- 
ting in which binding agreements can, in principle, be written (and enforced), 
the negotiations that lead to a final ‘cooperative’ outcome are likely to depend 
on non-cooperative considerations. In particular, the ability to write binding 
agreements does not preclude a player(s) from choosing not to cooperate. The 
decision to cooperate and join a larger coalition will depend on the payoff corre- 
sponding to the non-cooperative equilibrium. This approach, therefore, directly 
incorporates many non-cooperative ideas into what is essentially a cooperative 
framework. 

Cooperative game theory, with its emphasis on the possibilities for coopera- 
tion, has traditionally been associated with games in coalitional form, or char- 
acteristic function form. As a result, the primitive data of the problem being 
studied abstracts from the details of the negotiation process - details that may 
be completely specified in some underlying extensive form game.¹ One should
not hasten, though, to the extensive form specification since often it then turns 
out that the outcomes begin to depend too critically on rather small details of 
these procedures. The characteristic function may be viewed as a reduced form 
of a more detailed specification of a model. Its value lies in bringing to the fore
the cooperative abilities of various coalitions and from that to an analysis of
cooperation in the grand coalition.

¹We will turn presently to the normal form specification of a game, which can also be seen
as a primitive basis from which a characteristic function is derived.

We shall begin, in the next Section, by considering games in characteristic 
function form to discuss the influence of non-cooperative ideas on cooperative 
game theory or, more precisely, to discuss the notion of consistency in coalitional 
deviations. As a starting point, consider the notion of the core. An outcome is 
said to be in the core if no coalition has an objection to it, i.e., if no coalition 
can, on its own, improve the utility of its members.² One source of dissatisfaction
with the core relates to the notion of objections. It seems reasonable
to argue that the same considerations that qualify an objection as a means to 
discredit a status-quo should (for the same reasons) be used to check if the ob- 
jection itself can be similarly objected to. In other words, objections should be 
tested against counterobjections. The notion of a bargaining set incorporates 
this idea. An imputation belongs to the bargaining set if to every possible ob- 
jection there exists a counterobjection. Variants of this concept differ in their 
precise formulation of objections and counterobjections; the one that we define 
in Section 2 is due to Mas-Colell (1989). While these concepts are very much 
part of the cooperative theory, the idea of checking the credibility of an objec- 
tion has some hint of non-cooperative considerations too. This becomes even 
more explicit if one incorporates fully, or consistently, the ideas of objections 
and counterobjections. One could take the objection-counterobjection idea to 
its logical conclusion by subjecting counterobjections to a similar test, and so 
on. This was formalized as the consistent bargaining set in Dutta et al. (1989).
While this is not a non-cooperative notion, it is reminiscent of subgame perfec- 
tion and backwards induction; the idea that if you do such and such, the very 
same considerations that underlie the solution concept will permit a further 
reaction - a new subgame. 

²The formal definitions will appear in the next Section.

The coalitional form does not readily lend itself to a study of the formal
connections between cooperation and non-cooperation. It is the normal form
specification of a game that seems more appropriate for this purpose. And we
will argue that this leads to a cooperative theory in which equilibria are very
critically dependent on the non-cooperative equilibria. We begin in Section 3 with
a brief review of several core-like solution concepts for normal form games that
were introduced in Aumann (1961). The essential idea is to construct, for each
coalition, the set of feasible utility profiles and then appeal to the notion of the
core as the equilibrium concept for the derived game in coalitional form. The
immediate issue that arises is that of defining what the feasible utility sets for the
various coalitions are. Since the primitive model is specified as a normal form,
and the utilities of players depend on the complete strategy profile, the feasible
utility set of a coalition is not independent of the strategies of the complementary
coalition. One option is to assume, à la Nash, that the complementary coalition
keeps its strategy fixed according to the status-quo. For any given status-quo
then, we can derive the corresponding coalitional form. The corresponding
core-like, cooperative solution is the strong equilibrium. A strong equilibrium can
be viewed as a strategy profile with the property that the corresponding utility
profile belongs to the core of the associated coalitional form with this strategy
profile as the status-quo. Another possibility is to take the view that the feasible
utility set of a coalition should reflect the utilities that it can guarantee itself, i.e.,
a coalition is pessimistic in its outlook and fears the worst from its complement.
This approach makes it possible to define the feasible sets independently of a
status-quo. The cooperative solution concept it leads to is the $\alpha$-core, and the
related notion of the $\beta$-core. Analogous to the critique of the core and the
bargaining set in the previous paragraph, it can be argued that these solution
concepts, too, suffer from not consistently taking account of the repercussions
that follow from an initial objection. This argument is only strengthened by the
fact that the strong equilibrium, the $\alpha$-core and the $\beta$-core, despite having a common
connection to the core, lead to very diverse outcomes. Indeed, in the context of
the normal form, even a purely non-cooperative but coalitional approach that
consistently takes account of the reactions stemming from an initial objection
can bring new insights to the issue at hand.³

To see how consistency would apply to cooperative theory in the normal form,
consider what happens in a two-player game when one of the players deviates
from the grand coalition. This corresponds, of course, to each player acting
independently, or non-cooperatively, and the resulting outcome is then a
non-cooperative equilibrium, say a Nash equilibrium. This is far more convincing
a prediction of the result of a deviation than what is prescribed by the strong
equilibrium, the $\alpha$-core or the $\beta$-core. It should also be clear that the equilibrium
cooperative outcome will then depend very critically on the non-cooperative
equilibrium. A general formulation of this idea was recently developed in Ray
and Vohra (1994), and will be the subject of Section 3.2.

The present paper does not deal with non-cooperative coalitional approaches 
in the extensive form, an omission that is amply justified by the papers by Reny 
and Mas-Colell in this volume. 



2 Games in Characteristic Function Form 

Let $N = \{1, \ldots, n\}$ denote the set of players and let $\mathcal{N} = 2^N \setminus \{\emptyset\}$ denote the
set of all non-empty subsets of $N$. An element of $\mathcal{N}$ is referred to as a coalition.
For any coalition $S \in \mathcal{N}$, let $\mathbb{R}^S$ denote the $|S|$-dimensional Euclidean space
with coordinates indexed by the elements of $S$. For $u \in \mathbb{R}^N$, $u_S$ will denote
its projection on $\mathbb{R}^S$. We shall use the convention $\gg$, $>$, $\geq$ to order vectors in
$\mathbb{R}^S$. A game in characteristic function form is defined as $(N, V(S)_{S \in \mathcal{N}})$, where
$V(S) \subseteq \mathbb{R}^S$ refers to the feasible set of payoffs or utilities of coalition $S$.

³The reference here is to the notion of a coalition proof Nash equilibrium developed by
Bernheim, Peleg and Whinston (1987); see Section 3.1 below.







The set of imputations is defined as $V^*(N) = \{x \in V(N) \mid \nexists\, y \in V(N) \text{ such that } y > x\}$.

A pair $(S, y)$, where $S \in \mathcal{N}$ and $y \in V(S)$, is said to be an objection to an
imputation $x$ if $y > x_S$.

The core of $V$ is defined as

$C(V) = \{x \in V(N) \mid \text{there does not exist an objection to } x\}$.

To check the credibility of an objection one can test it against a counterobjection,
which refers to the ability of some other coalition to make an improvement
in the utility of its members; for players common to both coalitions the
comparison is to their utility in the objection, and for others the comparison is to
the status-quo.

Let $(S, y)$ be an objection to $x$. A pair $(T, z)$, where $T \in \mathcal{N}$ and $z \in V(T)$, is said
to be a counterobjection to $(S, y)$ if $z > (y_{S \cap T}, x_{T \setminus S})$.

An objection $(S, y)$ to $x$ is said to be a justified objection if there does not
exist any counterobjection to $(S, y)$.

The Bargaining Set is defined as

$B(V) = \{x \in V^*(N) \mid \text{there does not exist a justified objection to } x\}$.
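
The definitions of objection and core can be made concrete in the transferable utility special case. The sketch below is an illustration only, under the assumption (not made in the text, which works with general feasible sets) that $V(S)$ consists of all payoff vectors for $S$ whose sum does not exceed a worth $v(S)$; in that case a coalition $S$ has an objection to $x$ exactly when $v(S)$ exceeds the total that $x$ gives to $S$. The names `find_objection` and `in_core` and both example games are hypothetical.

```python
from itertools import combinations


def coalitions(players):
    """All non-empty subsets of the player set, as tuples."""
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield members


def find_objection(players, v, x, tol=1e-12):
    """Return a coalition that can object to x, or None.

    TU assumption: S can object iff v(S) > sum of x_i over i in S.
    """
    for S in coalitions(players):
        if v(S) > sum(x[i] for i in S) + tol:
            return S
    return None


def in_core(players, v, x, tol=1e-12):
    """x is in the core iff it is feasible for N and admits no objection."""
    feasible = sum(x[i] for i in players) <= v(tuple(players)) + tol
    return feasible and find_objection(players, v, x, tol) is None


players = (1, 2, 3)


def majority(S):
    # Hypothetical three-player majority game: any two players can secure 1.
    return 1.0 if len(S) >= 2 else 0.0


def mild(S):
    # Hypothetical game in which pairs are weak relative to the grand coalition.
    return {1: 0.0, 2: 1.5, 3: 3.0}[len(S)]


equal_split = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
print(find_objection(players, majority, equal_split))   # (1, 2)
print(in_core(players, mild, {1: 1.0, 2: 1.0, 3: 1.0}))  # True
```

In the majority game every feasible payoff vector gives some pair less than 1 in total, so every imputation has an objection; whether such an objection is justified is exactly what the bargaining set goes on to ask.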

This definition of the bargaining set is based on Mas-Colell (1989), and 
is the one that we shall discuss in the present paper.⁴ For a comprehensive
survey of various other notions of the bargaining set the reader is referred to 
Maschler (1992). What is common to this family of solution concepts is the 
idea that an imputation is not excluded simply because it has an objection. 
Objections are taken seriously, and termed justified, only if they do not admit 
counterobjections. If one accepts this objection-counterobjection logic, one is 
immediately faced with the following question: Does a counterobjection itself 
pass the test that objections are now being subjected to? It is compelling 
then, to put counterobjections to the same test that objections are subjected 
to. This idea was formalized by Dutta et al. (1989), who defined a notion of a
consistent bargaining set in which the credibility of counterobjections, and of 
further objections to them, and so on, is consistently evaluated. To define the 
consistent bargaining set we need some additional notation. 

⁴Zhou (1993) argues that a counterobjecting coalition should also be required to have a
proper intersection with the objecting coalition. See also Shimomura (1994).








Consider an imputation $x$. Define a collection $A$ as $\{x; (S^i, x^i)_{i=0}^m\}$, where $x$
is an imputation and, for each $i = 0, \ldots, m$, $x^i$ is an imputation⁵ for $S^i$. Define
$b(A) \in \mathbb{R}^N$ by

$b_j(A) = \max\{x_j, \, x_j^i \text{ for all } i \text{ such that } j \in S^i\}$.

A pair $(S, \bar{x})$, where $S \in \mathcal{N}$ and $\bar{x} \in V^*(S)$, is an objection to the collection
$A$ if

$\bar{x} > b_S(A)$.

A collection $\{x; (S^i, x^i)_{i=0}^m\}$ is a chain if $(S^0, x^0)$ is an objection to $x$, and
for each $i = 1, \ldots, m$, $(S^i, x^i)$ is an objection to the collection $A^{i-1} = \{x; (S^k, x^k)_{k=0}^{i-1}\}$.

A pair $(S, \bar{x})$ is a terminating objection to the chain $A = \{x; (S^i, x^i)_{i=0}^m\}$ if it
is an objection to $A$ such that there is no objection to the chain $\{x; (S^i, x^i)_{i=0}^m, (S, \bar{x})\}$.

An objection $(S, \bar{x})$ to $A$ is valid if there is no valid objection to $\{A, (S, \bar{x})\}$.
It is invalid if there exists a valid objection to $\{A, (S, \bar{x})\}$.

Given that the number of coalitions is finite, and that an objection is drawn 
from the set of imputations of the dissenting coalition, it follows that all chains 
must be of finite length. Since a terminating objection to a chain is necessarily 
valid, we can work backwards from the valid terminating objections to uniquely 
determine the “label” of each objection. 

The Consistent Bargaining Set is defined as

$CB(V) = \{x \in V^*(N) \mid \nexists \text{ a valid objection to } x\}$.
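
The backward labelling of objections can be sketched on an abstract finite tree of chains. In the Python illustration below, which is an assumption-laden sketch rather than the authors' construction, each node stands for the chain ending in that objection and its children are the objections to that chain; the tree itself and all labels (`S1`, `T1`, and so on) are hypothetical.

```python
def is_valid(tree, objection):
    """An objection is valid iff no objection to the chain it extends is
    itself valid; a terminating objection (no children) is therefore valid."""
    return not any(is_valid(tree, nxt) for nxt in tree.get(objection, []))


def is_justified(tree, objection):
    """Bargaining-set test: justified iff it admits no counterobjection."""
    return not tree.get(objection, [])


# Hypothetical finite tree of chains emanating from an imputation "x".
tree = {
    "x": ["S1", "S2"],
    "S1": ["T1"],          # T1 terminates the chain, so S1 is invalid
    "S2": ["T3", "T4"],    # both objections to S2 are themselves terminated
    "T3": ["U3"],
    "T4": ["U4"],
}

in_bargaining_set = not any(is_justified(tree, o) for o in tree["x"])
in_consistent_set = not any(is_valid(tree, o) for o in tree["x"])

print(is_valid(tree, "S1"), is_valid(tree, "S2"))  # False True
print(in_bargaining_set, in_consistent_set)        # True False
```

Because every chain is finite, the recursion terminates. In this tree no objection to the root is justified, yet one of them is valid, so the imputation would belong to the bargaining set but not to the consistent bargaining set.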

To see, heuristically, how the consistent bargaining set differs from the
bargaining set consider Figure 1. It shows some of the chains that emanate from
an imputation $x$. The objections in boxes denote terminating objections to a
chain. Since the two objections to $x$ shown in the figure both have counterobjections,
neither one of them can be used to rule out $x$ from being in the bargaining set.
In fact, one of them is not only not justified, it is also invalid (since it is met
by a terminating objection). However, the other is a valid objection if all objections to it are, like
the two displayed in the figure, terminated by some other objection.⁶ Moreover, if all⁷
objections to $x$ have counterobjections, then, in this example, $x \in B(V) \setminus CB(V)$.

⁵The set of imputations for $S$ is defined as $V^*(S) = \{x \in V(S) \mid \nexists\, y \in V(S) \text{ such that } y > x\}$.

⁶Note that Figure 1 is extremely incomplete. To establish the validity of an objection to $x$, it is
necessary to consider all possible objections to the chain formed by $x$ and that objection, including
the objections displayed in the figure.

⁷Including the objections displayed in Figure 1, etc.
Figure 1: Valid and Invalid objections.



It is, of course, easy to see that

$C(V) \subseteq CB(V) \subseteq B(V)$.

There are interesting, special cases in which $CB(V) = B(V)$. In such cases,
the objection/counterobjection logic is itself enough to check the validity of an
objection.

Proposition 2.1 (Dutta et al (1989)). In three-player games, and in all strictly 
comprehensive, superadditive, ordinally convex games CB(V) = B(V). 

Proposition 2.2 (Mas-Colell (1989)). In exchange economies with an atom- 
less measure space of consumers B{V) = C{V). 

It is also worth remarking that if all objecting coalitions to a chain are 
required to be subsets of the last coalition in the chain, then again consistency 
has no additional bite. Defining internal consistency by restricting attention to 
‘internal chains’, we can define the internally consistent bargaining set as the 
set of imputations to which there does not exist an internally valid objection. 
Certainly, this set contains the core. Moreover, if there exists an objection which 
is not internally valid, it is possible to find one which is. This implies that the 
internally consistent bargaining set is identical to the core, or that the core is 
internally consistent; see Ray (1989). 

Notice that the von Neumann-Morgenstern stable set solution is also designed
to look consistently beyond a single step deviation. In fact, as Greenberg
(1992) shows, the consistent bargaining set can alternatively be obtained
by formulating the notion of objections and counterobjections in the context of
an abstract stable set. He also demonstrates the fact that, in such solution concepts,
the results are very sensitive to whether objections are defined with strict
inequalities or semi-strict inequalities.

Since the bargaining set contains the core, its existence can be established 
under weaker conditions. In transferable utility games, the bargaining set typ- 
ically exists even in games that are not balanced; see Maschler (1992), Vohra 
(1991) and Zhou (1993). Dutta et al (1989) provide an example of a four player 
superadditive, TU game in which the consistent bargaining set is empty. Sim- 
ple, sufficient conditions (weaker than balancedness) under which the consistent 
bargaining set is non-empty are not yet known. 

3 Games in Normal Form 

In this section we shall see how the idea of consistency discussed in the previous 
Section can be applied to non-cooperative and cooperative solution concepts in 
the context of games in normal form. We begin by reviewing some well-known 
coalitional solutions for normal form games introduced by Aumann (1961). 

Let $N = \{1, \ldots, n\}$ denote the set of players. Each player has a strategy set
$X_i$, a subset of a Euclidean space, and a utility function $u_i : X \to \mathbb{R}$, where $X = \prod_{i \in N} X_i$. A game in
normal form is defined as $(N, X_i, u_i)$. We shall use $u_S(x)$ to denote $(u_i(x))_{i \in S}$
and $-S$ to denote the coalition $N \setminus S$.

A strategy profile $x \in X$ is said to be a Strong Equilibrium (SE) of a game
$(N, X_i, u_i)$ if there does not exist a coalition $S$ and $\bar{x}_S \in X_S$ such that

$u_S(\bar{x}_S, x_{-S}) > u_S(x)$.

Given $x \in X$, we can define for each coalition a set of feasible utilities,
treating the complementary coalition’s strategies as fixed. Let

$V(S; x) = \{u \in \mathbb{R}^S \mid \text{there exists } \bar{x}_S \in X_S \text{ such that } u \leq u_S(\bar{x}_S, x_{-S})\}$.

Thus for every $x \in X$ we can define a game in characteristic function form
$(N, V(\cdot\,; x))$, and $x$ is a strong equilibrium for the original game if and only if
$u(x)$ belongs to the core of $(N, V(\cdot\,; x))$.

A strategy profile $x \in X$ is said to be in the $\alpha$-core of a game $(N, X_i, u_i)$ if
there does not exist a coalition $S$ and $\bar{x}_S \in X_S$ such that for all $y_{-S} \in X_{-S}$,

$u_S(\bar{x}_S, y_{-S}) > u_S(x)$.

A strategy profile $x \in X$ is said to be in the $\beta$-core of a game $(N, X_i, u_i)$ if
there does not exist a coalition $S$ such that for every $y_{-S} \in X_{-S}$ there exists
$\bar{x}_S \in X_S$ such that

$u_S(\bar{x}_S, y_{-S}) > u_S(x)$.

Observe that

$SE \subseteq \beta\text{-core} \subseteq \alpha\text{-core}$.
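
For a finite game the three solution sets can be computed by brute force. The following Python sketch is illustrative only: it assumes finite strategy sets, it reads the vector inequality as a strict gain for every member of the deviating coalition (one possible reading, adopted here for simplicity), and the function names and the donor version of the prisoner's dilemma (take 1 for oneself, or give 4 to the other player) are choices made for this example.

```python
from itertools import combinations, product


def profiles(strategies, members):
    """All joint strategy choices of `members`, as dicts player -> strategy."""
    members = list(members)
    for choice in product(*(strategies[i] for i in members)):
        yield dict(zip(members, choice))


def improves(payoff, S, new, old):
    """Assumed reading of u_S(new) > u_S(old): every member strictly gains."""
    return all(payoff(new)[i] > payoff(old)[i] for i in S)


def solution_sets(players, strategies, payoff):
    """Return (strong equilibria, alpha-core, beta-core) of a finite game."""
    all_coalitions = [S for r in range(1, len(players) + 1)
                      for S in combinations(players, r)]
    se, alpha, beta = [], [], []
    for x in profiles(strategies, players):
        block_se = block_alpha = block_beta = False
        for S in all_coalitions:
            rest = [i for i in players if i not in S]
            own = list(profiles(strategies, S))
            others = list(profiles(strategies, rest))
            x_rest = {i: x[i] for i in rest}
            # Strong equilibrium: the complement keeps playing x_{-S}.
            block_se |= any(improves(payoff, S, {**d, **x_rest}, x) for d in own)
            # alpha: one deviation works against every reply of the complement.
            block_alpha |= any(all(improves(payoff, S, {**d, **y}, x)
                                   for y in others) for d in own)
            # beta: against every reply of the complement some deviation works.
            block_beta |= all(any(improves(payoff, S, {**d, **y}, x)
                                  for d in own) for y in others)
        key = tuple(x[i] for i in players)
        if not block_se:
            se.append(key)
        if not block_alpha:
            alpha.append(key)
        if not block_beta:
            beta.append(key)
    return se, alpha, beta


def pd_payoff(profile):
    """Each player either takes 1 for himself or gives 4 to the other."""
    u = {1: 0, 2: 0}
    for i, other in ((1, 2), (2, 1)):
        if profile[i] == "take":
            u[i] += 1
        else:
            u[other] += 4
    return u


players = (1, 2)
strategies = {1: ["take", "give"], 2: ["take", "give"]}
se, alpha, beta = solution_sets(players, strategies, pd_payoff)
print(se, alpha, beta)  # [] [('give', 'give')] [('give', 'give')]
```

In this example the set of strong equilibria is empty while both cores select mutual giving, and the inclusions displayed above hold.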



While a strong equilibrium exists only in very restrictive cases, Scarf (1971) 
showed that in an exchange economy with externalities, the $\alpha$-core is non-empty
if all utility functions are quasi-concave. 

All these solution concepts are designed to focus on strategy profiles that are 
immune to objections. However, in contrast to games in characteristic function 
form, one must now also specify what the complementary coalition is assumed 
to do in the face of an objection. The three alternatives above consider the case 
in which the complementary coalition is assumed to keep its actions unchanged 
or to act in a way that minimizes the benefits to the dissenting coalition. In 
light of the discussion of the previous Sections we need not dwell on the fact that 
each of these assumptions is somewhat ad-hoc. Take the notion of the strong 
equilibrium, and consider the idea of checking the credibility of an objection. In 
particular, we can ask if a subcoalition from the objecting coalition can make
a similar objection (counterobjection). In fact, one can develop this idea in 
a purely non-cooperative vein as has been proposed by Bernheim, Peleg and 
Whinston (1987). 



3.1 Consistency in a Non-Cooperative Coalitional Theory 

Consider a model in which the players can communicate freely but cannot make 
binding commitments. In a two-player game this would allow refining Nash 
equilibria to those that are Pareto undominated by other Nash equilibria. In 
games with more than two players, this idea can be applied recursively
provided one restricts attention to internal deviations, i.e., assume that dissenting
coalitions must be subcoalitions of previously objecting coalitions. 

Given $y_{-S} \in X_{-S}$, we define the utility function of $i \in S$ as $\tilde{u}_i : X_S \to \mathbb{R}$,
where $\tilde{u}_i(x_S) = u_i(x_S, y_{-S})$. We can now define the game $\Gamma(y_{-S})$ induced on $S$ by
the actions $y_{-S}$ of the coalition $-S$ as

$\Gamma(y_{-S}) = (S, (X_i, \tilde{u}_i)_{i \in S})$.

In a game with a single player $i$, $x_i \in X_i$ is a coalition-proof Nash equilibrium
if and only if $x_i \in \arg\max_{X_i} u_i(\cdot)$.

Let $n > 1$ and suppose coalition-proof Nash equilibrium has been defined
for games with fewer than $n$ players. Then

(a) For a game $\Gamma$ with $n$ players, $x$ is self-enforcing if for all coalitions $S$ such
that $|S| < n$, $x_S$ is a coalition-proof Nash equilibrium in the game $\Gamma(x_{-S})$.

(b) For a game $\Gamma$ with $n$ players, $x$ is a coalition-proof Nash equilibrium if it is
self-enforcing and there does not exist another self-enforcing $y$ such that
$u_i(y) > u_i(x)$ for all $i = 1, \ldots, n$.
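
The recursion in (a) and (b) translates directly into code when strategy sets are finite. The sketch below is a naive illustration under that finiteness assumption (the text allows general strategy sets); the function name `cpne` and the coordination game are hypothetical, and the enumeration is exponential, so it is meant only for very small games.

```python
from itertools import combinations, product


def cpne(players, strategies, payoff):
    """Coalition-proof Nash equilibria of a finite game, by the recursion.

    players: tuple of labels; strategies: dict label -> list of strategies;
    payoff: function mapping a profile dict to a dict of payoffs.
    """
    players = tuple(players)
    all_profiles = [dict(zip(players, c))
                    for c in product(*(strategies[i] for i in players))]

    if len(players) == 1:
        # Single player: the maximizers of that player's payoff.
        i = players[0]
        best = max(payoff(p)[i] for p in all_profiles)
        return [p for p in all_profiles if payoff(p)[i] == best]

    def self_enforcing(x):
        # (a): x_S must be coalition-proof in the game induced on every
        # proper coalition S when the complement plays x_{-S}.
        for size in range(1, len(players)):
            for S in combinations(players, size):
                fixed = {i: x[i] for i in players if i not in S}
                induced = lambda xs, fixed=fixed: payoff({**xs, **fixed})
                x_S = {i: x[i] for i in S}
                if x_S not in cpne(S, strategies, induced):
                    return False
        return True

    candidates = [x for x in all_profiles if self_enforcing(x)]
    # (b): discard self-enforcing profiles that another self-enforcing
    # profile strictly improves upon for every player.
    return [x for x in candidates
            if not any(all(payoff(y)[i] > payoff(x)[i] for i in players)
                       for y in candidates)]


def coordination(profile):
    """Hypothetical game with two Pareto-ranked Nash equilibria."""
    if profile[1] == profile[2]:
        value = 2 if profile[1] == "A" else 1
        return {1: value, 2: value}
    return {1: 0, 2: 0}


strategies = {1: ["A", "B"], 2: ["A", "B"]}
print(cpne((1, 2), strategies, coordination))  # [{1: 'A', 2: 'A'}]
```

With two players the self-enforcing profiles are exactly the Nash equilibria, so the example reproduces the refinement to Nash equilibria that are not Pareto dominated by other Nash equilibria.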

The assumption that a deviating coalition only takes into account further 
dissension from within, and not from coalitions that contain members from its 
complement, is a strong one. Its power lies in being able to develop the concept 
recursively. Recall that in defining the consistent bargaining set we could rely 
on the finiteness of chains to work backwards and label objections as valid or 
invalid. In a normal form game, the analogous notion of chains would lead, 
in general, to chains that are not of finite length - unless we confine ourselves 
to internal deviations. By concentrating on internal objections, it is possible 
to develop coalition proof Nash equilibrium by defining chains of objections 
in the normal form and following the steps in the definition of the consistent 
bargaining set. It can also be developed by appealing to the notion of abstract 
stable sets, as shown by Greenberg (1990).⁸ While it is important to extend
these ideas beyond the confines of internal objections, this issue has not yet 
been completely resolved; see, however, Chakravorti and Kahn (1991). 

3.2 Consistency in a Cooperative Theory 

We now turn to an application of consistency to a purely cooperative theory 
in the normal form. At the outset, it is important to keep in mind a simple 
rule of thumb; for a solution concept to qualify as a cooperative solution it 
ought to have the property that in the prisoner’s dilemma it picks out the 
‘cooperative outcome’ and rules out the ‘non-cooperative outcome’. This the 
a-core and the /3-core satisfy; the strong equilibrium is empty in the prisoner’s 
dilemma. However, these solutions are not, in general, consistent. For a concrete 
motivation, consider a cooperative solution to the Cournot duopoly with a linear 
demand curve and constant average costs. The strategy set of each firm is the 
real line - representing its output level. In this example too, there exists no 
strong equilibrium. The $\alpha$-core, which in this example is identical to the
$\beta$-core, consists of all strategy profiles that are individually rational and Pareto optimal.
In particular, one firm producing zero and the other producing the monopoly
output is a strategy profile in the $\alpha$-core. The reason is simply that neither firm
can object to getting zero profit if it assumes (as the $\alpha$-core prescribes) the
very worst, i.e., if it assumes that its rival will flood the market. It would be 
reasonable to argue that when a coalition deviates it should not take as given 
the strategies of its complement, nor should it fear the worst. It should look 
ahead to a resulting ‘equilibrium’ that its actions induce. The real issue then, 
is to describe what the equilibrium is that emerges when one firm breaks away 

®In both these alternative approaches, however, there is a complication that creeps in if 
the set of self-enforcing strategies is not compact; see Claim 7.2.5 in Greenberg (1990) and
Kahn and Mookherjee (1992). 
from the grand coalition, i.e., the outcome that corresponds to the break-up of
cooperation. 

We take the view that a breakdown of cooperation induces a non-cooperative
situation. Taking Nash equilibrium as the ‘equilibrium’ corresponding to the 
state of non-cooperation then, no firm should fear that by breaking off nego- 
tiations it will receive less than the Nash equilibrium profit. The equilibrium 
cooperative outcome would then be the set of all Pareto optimal outcomes in 
which each firm receives at least its Nash payoff. Somewhat more generally, 
this line of argument leads to the conclusion that in a two-player game with a 
unique Nash equilibrium, a solution concept we seek should predict as equilibria 
all Pareto optimal outcomes that weakly dominate the Nash equilibrium. The 
problem, of course, is more complex in games with three or more players, but this
basic idea has recently been formalized by Ray and Vohra (1994) in the notion
of equilibrium binding agreements. The problem also becomes more interesting
in games with more than two players because then full cooperation in the grand
coalition and pure non-cooperation are not the only possibilities; a partial
breakdown of cooperation in the form of some intermediate coalition structure also
becomes a possibility; see Example 3.2.1 below.
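
The two-player prediction can be made concrete for the Cournot duopoly. The following is a minimal sketch under stated assumptions, not a computation taken from the text: inverse demand is taken to be P = a - Q with slope one, both firms have the same constant unit cost c, and the function name and parameter values are hypothetical. With identical constant costs, the Pareto optimal profiles are those in which total output equals the monopoly output, so the predicted agreements are the divisions of the monopoly output that give each firm at least its Nash profit.

```python
from fractions import Fraction

def cournot_summary(a, c):
    """Linear Cournot duopoly with inverse demand P = a - Q and unit cost c.

    Assumed (hypothetical) specification: demand slope equal to one and
    identical constant unit costs for both firms.
    Returns (Nash profit per firm, monopoly output, (q_min, q_max)), where
    [q_min, q_max] is the range of firm 1's output on the Pareto frontier
    for which both firms earn at least their Nash profit.
    """
    margin = Fraction(a - c)
    nash_profit = margin ** 2 / 9          # each firm produces margin / 3
    monopoly_output = margin / 2           # maximizes Q * (margin - Q)
    unit_profit = margin - monopoly_output # P - c at the monopoly output
    q_min = nash_profit / unit_profit      # smallest output giving Nash profit
    q_max = monopoly_output - q_min        # leaves the rival exactly q_min
    return nash_profit, monopoly_output, (q_min, q_max)

nash_profit, monopoly_output, (q_min, q_max) = cournot_summary(10, 1)
```

For a = 10 and c = 1 the Nash profit is 9, the monopoly output is 9/2, and firm 1's output in a predicted agreement lies between 2 and 5/2. The profile in which one firm produces zero, which belongs to the $\alpha$-core, is excluded.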

There are three basic ingredients in the concept of equilibrium binding agree- 
ments: 

1. Equilibrium binding agreements are meant to capture the idea that any 
coalition can, in principle, write a binding agreement among its members 
but this agreement must be independent of the actions of players outside 
this coalition. Non-cooperative play across coalitions is modeled a la Nash. 
Thus, one feature of this equilibrium concept is that if in equilibrium the 
coalition structure that emerges is $\mathcal{P}$, then the equilibrium strategy profile
$x$ must satisfy the best response property with respect to $\mathcal{P}$. Formally,
this is the requirement that for every $S \in \mathcal{P}$ there is no $y_S \in X_S$ such
that $u_S(y_S, x_{-S}) \gg u_S(x)$. Let $\beta(\mathcal{P})$ denote the set of best response
strategy profiles for $\mathcal{P}$. A necessary condition for $x$ to constitute a binding
agreement for $\mathcal{P}$ is that $x \in \beta(\mathcal{P})$.

2. It is assumed that agreements can only be written between members of 
an existing coalition. Thus, deviations can only make an existing coali- 
tion structure finer - mergers are ruled out. Recall that this is also the 
assumption on which the notion of a coalition proof Nash equilibrium is 
based. And here again the strength of this assumption lies in allowing for 
a recursive definition. 

3. Deviations from an existing coalition structure must be based on a con- 
sistent prediction of any further reorganization of the coalition structure 
that may follow from the initial deviation. 

Notice that these conditions immediately yield the conclusion that for the 
finest coalition structure, $\mathcal{P}^*$, the set of equilibrium binding agreements, denoted
$B(\mathcal{P}^*)$, is precisely the set $\beta(\mathcal{P}^*)$, which is the set of Nash equilibria of the game.
In other words, if all players are in singleton coalitions, the equilibrium binding 
agreements are precisely the Nash equilibria of the game. 

To describe binding agreements for an arbitrary coalition structure we will 
consider the simple case in which there are only three players; for the more 
general case the reader is referred to Ray and Vohra (1994). It will be useful 
to begin by describing the corresponding story. Consider the three players 
negotiating in this grand coalition and discussing the possibility of agreeing
upon a weakly Pareto optimal strategy profile $x$. Suppose player $i$ contemplates
leaving the grand coalition expecting to do better than $x$. After such a deviation
there are only two kinds of coalition structures that can eventually emerge. One
possibility is that this coalition structure, $(\{i\}, \{j, k\})$, is ‘stable’. The other
possibility is that this coalition structure breaks up into $\mathcal{P}^*$. In the first case,
player $i$ should deviate if there exists an ‘equilibrium binding agreement’ $x'$
for the coalition structure $(\{i\}, \{j, k\})$ such that $u_i(x') > u_i(x)$. In the second
case, $i$ should deviate if there exists $y \in \beta(\mathcal{P}^*)$ such that $u_i(y) > u_i(x)$. This
provides a consistent set of rules for player $i$ to decide whether or not to sign
the agreement $x$. The only concept left undefined so far is the notion of ‘an
equilibrium binding agreement’ for the coalition structure $(\{i\}, \{j, k\})$. But this
is easy to do, using similar arguments, given that the only possible change in this
coalition structure is to $\mathcal{P}^*$, which will arise if and only if there exists $y \in B(\mathcal{P}^*)$
such that either $u_j(y) > u_j(x')$ or $u_k(y) > u_k(x')$. Similar considerations can
be used to determine whether or not a two-player coalition would agree to sign
the agreement $x$.

We can now provide the formal definitions based on this discussion. If $N =
\{1,2,3\}$, there are only five different coalition structures, $N = \{1,2,3\}$, $\mathcal{P}^* =
(\{1\}, \{2\}, \{3\})$ and $\mathcal{P}^i = (\{i\}, N_{-i})$, $i = 1, 2, 3$. Consider the coalition structure
$\mathcal{P}^i$. Suppose $x \in \beta(\mathcal{P}^i)$. By assumption the only subcoalitions that can deviate
for this coalition structure are $\{j\}$ and $\{k\}$, where $j, k \neq i$. And in the coalition
structure that they induce, $\mathcal{P}^*$, the equilibrium outcomes are $B(\mathcal{P}^*) = \beta(\mathcal{P}^*)$.
Given $x \in \beta(\mathcal{P}^i)$, we say that $(\mathcal{P}^*, x')$ blocks $(\mathcal{P}^i, x)$ if $x' \in B(\mathcal{P}^*)$ and there
exists $j \neq i$ such that $u_j(x') > u_j(x)$. We can now define $B(\mathcal{P}^i)$ as the set
of $x \in \beta(\mathcal{P}^i)$ that are not blocked. Having defined the equilibrium binding
agreements for the intermediate coalition structures, we can now consider the
blocking possibilities starting from the grand coalition. Note that $\beta(N)$ is the
set of all weakly Pareto optimal strategies. There are two kinds of coalition
structures that can be used to block $(N, x)$, where $x \in \beta(N)$. We say that
$(\mathcal{P}^i, x')$ blocks $(N, x)$ if $x' \in B(\mathcal{P}^i)$ and either $u_i(x') > u_i(x)$ or $u_j(x') > u_j(x)$
for $j \neq i$, i.e., either a single player can do better by deviating or a coalition
of two players can do better by deviating. Note that in each case, the new
coalition structure $\mathcal{P}^i$ is immune to further deviations, since $x' \in B(\mathcal{P}^i)$. It is
also possible that $(\mathcal{P}^*, x')$, where $x' \in B(\mathcal{P}^*)$, blocks $(N, x)$. This can happen
in two ways: either there exists $i$ such that $u_i(x') > u_i(x)$ and $B(\mathcal{P}^i) = \emptyset$, or
there exist $i, j$ such that $u_i(x') > u_i(x)$ and $u_j(x') > u_j(x)$. And $B(N)$ is the
set of $x \in \beta(N)$ such that it cannot be blocked.
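
The recursion in these definitions can be traced mechanically on a small finite game. The sketch below is an illustration under stated assumptions and not the game of Ray and Vohra (1994): the payoff function is a hypothetical linear public-goods game with binary contributions, all function names are invented for the illustration, and the two-player blocking condition is read as requiring that both members of the pair gain strictly.

```python
from itertools import product

PLAYERS = (0, 1, 2)
ACTIONS = (0, 1)  # hypothetical binary contribution levels
PROFILES = list(product(ACTIONS, repeat=3))

def payoff(x):
    """Hypothetical payoffs: each unit contributed yields 3 to everyone, costs 5."""
    total = sum(x)
    return tuple(3 * total - 5 * xi for xi in x)

def better(x_new, x_old, p):
    return payoff(x_new)[p] > payoff(x_old)[p]

def beta(structure):
    """Best response profiles: no coalition in the structure can change its own
    strategies so that all of its members gain strictly."""
    result = []
    for x in PROFILES:
        ok = True
        for S in structure:
            for choice in product(ACTIONS, repeat=len(S)):
                y = list(x)
                for p, a in zip(S, choice):
                    y[p] = a
                if all(better(tuple(y), x, p) for p in S):
                    ok = False
        if ok:
            result.append(x)
    return result

FINEST = ((0,), (1,), (2,))
GRAND = ((0, 1, 2),)
B_finest = beta(FINEST)  # the Nash equilibria

def B_intermediate(i):
    """Binding agreements for ({i}, {j, k}): best response profiles that no
    member of the pair wants to abandon for an agreement of the finest structure."""
    others = [p for p in PLAYERS if p != i]
    return [x for x in beta(((i,), tuple(others)))
            if not any(better(x2, x, j) for x2 in B_finest for j in others)]

def B_grand():
    stable = []
    for x in beta(GRAND):
        blocked = False
        for i in PLAYERS:
            others = [p for p in PLAYERS if p != i]
            Bi = B_intermediate(i)
            # blocking via an intermediate structure
            if any(better(x2, x, i) or all(better(x2, x, j) for j in others)
                   for x2 in Bi):
                blocked = True
            # a single player reaches the finest structure only if B(P^i) is empty
            if not Bi and any(better(x2, x, i) for x2 in B_finest):
                blocked = True
        # two players gaining at an agreement of the finest structure
        if any(sum(better(x2, x, p) for p in PLAYERS) >= 2 for x2 in B_finest):
            blocked = True
        if not blocked:
            stable.append(x)
    return stable
```

In this hypothetical game the only Nash equilibrium is that nobody contributes, the only binding agreement for each intermediate structure has the pair contributing and the singleton free riding with a payoff of 6, and no agreement for the grand coalition survives, because total payoffs never exceed 12 and so some player always receives less than 6.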
Figure 2: An Example of Inefficiency.



The concept of equilibrium binding agreements can be used to address an 
important issue concerning the efficiency of equilibrium outcomes when binding 
agreements are feasible. It is widely believed that if there are no informational 
imperfections, then the ability to make binding agreements must result in all 
gains from cooperation being exploited. As we have already indicated, this 
conclusion is valid in two-player games. More precisely, in a two-player game 
with a unique Nash equilibrium the outcome will be efficient; the set of binding
agreements corresponds to all weakly Pareto optimal strategy profiles that weakly
dominate the Nash equilibrium. But, surprisingly, in games with three or more 
players, the theory does not bear out the conclusion that equilibrium binding 
agreements are always efficient. It is possible that every Pareto optimal outcome 
that dominates the Nash equilibrium is blocked by a coalition, and leads to an 
inefficient outcome with an equilibrium coalition structure that is neither N nor 
V*. This is shown by the following example, which will also serve to illustrate 
the notion of an equilibrium binding agreement. 

Example 3.2.1 (Ray and Vohra (1994)).

Consider a three-consumer economy with one private good and one public 
good. Let $x_{ia}$, $x_{ib}$ and $x_{ic}$ denote, for consumer $i$, successively higher levels of
contribution towards the public good. It is possible to specify well behaved, 
quasi-linear utility functions such that they result in the normal form game 
depicted in Figure 2, where player 1 chooses rows, player 2 chooses columns and 
player 3 chooses matrices. 
We claim that in this example there is no efficient equilibrium binding agree-
ment, and that the grand coalition breaks up into an intermediate coalition
structure. Notice first that every player $i$ has a dominant strategy, $x_{ia}$. Thus
the unique Nash equilibrium, and the only equilibrium binding agreement for
$\mathcal{P}^*$, is $(x_{1a}, x_{2a}, x_{3a})$, which is Pareto dominated by $(x_{1c}, x_{2c}, x_{3c})$.

Next, we examine the equilibrium binding agreements for an intermediate
coalition structure $\mathcal{P} = (\{i\}, \{j, k\})$. Since the game is symmetric, there is no
loss of generality in considering the coalition structure $\mathcal{P} = (\{3\}, \{1,2\})$. Since
player 3’s dominant strategy is $x_{3a}$, any $z \in \beta(\mathcal{P})$ must be such that $z_3 = x_{3a}$.
Thus we need only look at the first matrix. Clearly, both $(x_{1a}, x_{2a})$ and $(x_{1c}, x_{2c})$
are dominated by $(x_{1b}, x_{2b})$. In fact, it is easy to see that $(x_{1b}, x_{2b}, x_{3a}) \in \beta(\mathcal{P})$.
Moreover, this strategy cannot be blocked by a deviation to $\mathcal{P}^*$. It is, therefore,
an equilibrium binding agreement. Indeed, this is the only one for this coalition
structure. To see this, notice that in all other best response equilibria, either
player 1 or player 2 receives less than 2.6, the unique Nash payoff. Since the game
is symmetric, we can now claim that for every intermediate coalition structure
$(\{i\}, \{j, k\})$ the unique equilibrium strategy profile is $(x_{ia}, x_{jb}, x_{kb})$. The payoffs
to $i$, $j$ and $k$ are 3.7, 2.7 and 2.7 respectively. But this outcome is not efficient.
It is Pareto dominated by $(x_{ib}, x_{jc}, x_{kc})$.

Finally, consider the grand coalition. For any strategy profile it must be the
case that there exists a player, $i$, who gets less than 3.7. This player can then
block this proposal by deviating to $(\{i\}, \{j, k\})$ and earning 3.7. This, in fact,
is the only coalition structure that $i$ can induce by deviating from the grand
coalition. Thus, the grand coalition breaks up into some intermediate coalition 
structure with an inefficient equilibrium. And the only equilibrium in the finest 
coalition structure too is inefficient. Since all the best response equilibria are 
strict, it follows that this example is robust. 

A more detailed analysis of the public goods model is contained in Ray and 
Vohra (1994). The Cournot oligopoly is another interesting model to which 
these ideas have been applied; see Bloch (1992), Ray and Vohra (1994) and Yi 
(1993). 

A drawback of the approach we have described is that it is restricted to 
internal deviations - it is assumed that deviating coalitions cannot merge with 
others. While it is possible to define a more general solution concept, it seems 
difficult to do so while retaining the relative transparency and tractability that 
comes with a recursive definition. For example, consider Greenberg’s (1990) 
framework of abstract stable sets. This approach does incorporate consistency, 
and Greenberg suggests two new notions for normal form games, contingent 
threats equilibrium and coalitional commitments equilibrium, that appear to 
be related to the idea of binding agreements. However, the former does not 
exist even in the prisoner’s dilemma and the latter does not always predict 
cooperation even in two-player games.^ It is also possible that further progress 

It is also worth keeping in mind Chwe’s (1994) critique of the abstract stable set solution,
on this issue might come from studying an extensive form model of coalitional
bargaining. And one can expect that progress here would also help in extending 
the notion of a coalition proof Nash equilibrium beyond the case of internal 
deviations. 
The Theory of Social Situations (TOSS) is an integrative approach to the
study of formal models in social and behavioral sciences. TOSS unifies the 
representation of “cooperative” and “non-cooperative” social environments, 
allowing for diverse coalitional interactions. It does so using the notion of a 
(social) “situation”. TOSS disassociates the solution concept from the rep- 
resentation of social environments. The unified solution concept in TOSS — 
“stable standard of behavior” — employs stability as the sole criterion. One 
of the important merits of TOSS is that by representing a social environment 
as a situation, it specifies the exact negotiation process and the way in which 
players and coalitions use the set of outcomes (actions, alternatives) avail- 
able to them. Moreover, the flexibility of TOSS enables the analysis of social
environments that cannot be studied within the classical paradigm of game 
theory. This lecture is divided into three parts: (1) Motivation for the no- 
tion of a social situation, (2) Formal definitions of a situation and of a stable 
standard of behavior, and (3) Some applications of TOSS to cooperation. ^ 

1. Motivation 

In this section I shall show that none of the three types of games provides an 
adequate representation of a social environment. In particular, the negotia- 
tion process and the behavioral assumptions are not specified when a social 
environment is described as either a cooperative game, or a normal form 
game, or an extensive form game. Thus, for example, within the context 
that is of interest to us here, no answer is provided to questions such as:

What, in fact, is the meaning of forming a coalition — is it a binding 
commitment of the players to remain together, or to never re-negotiate 
with nonmembers, or is it merely a “declaration of intentions”, which can 



^ Unless otherwise referenced, results in this lecture can be found in Greenberg (1990).
be revised? Do players first form a coalition and only then discuss their
payoffs there, or are the two decisions made simultaneously? 

As a result, many different negotiation processes can be associated with 
each of these three types of games. ^ I shall now illustrate the validity of this 
claim for each of the three types of games. 

1.1 Cooperative Games 

A cooperative game is a pair $(N, v)$, where $N$ is the nonempty set of players
and $v$ is the characteristic function which assigns to every coalition $S \subseteq N$
a nonempty subset of $\mathbb{R}^S$, denoted $v(S)$. For $S \subseteq N$, the subgame $(S, v_S)$ is
given by $v_S(T) = v(T)$ for every $T \subseteq S$.

Cooperative games were the main object of investigation by von Neumann 
and Morgenstern in their pioneering work (1944), and have been extensively 
studied since then. However, as I shall argue, the description of a social en- 
vironment provided by cooperative games is incomplete. While it specifies 
the set of payoffs available to coalitions if and when they form, it is totally 
silent on the crucial issues of how exactly this set can be used in the nego- 
tiation process and what it means to form a coalition. As a result, different 
negotiation processes and behavioral and institutional assumptions can be 
associated with a given cooperative game. The following is a sample of the 
many different scenarios in which the game can be played. 

(1) A coalition first forms and only then its members decide on the distri- 
bution of payoffs within the coalition. Moreover, once a coalition forms the 
game is over (at least for its members). Schematically, 

$x \in v(N) \to (S, v(S))$.

That is, when x is offered, coalition S can form and then choose the payoff 
it will adopt. No further modifications are thereafter possible. 

(2) A coalition forms and decides on the payoff at the same time. Moreover, 
once a coalition forms the game is over (at least for its members). Schemat- 
ically, 



$x \in v(N) \to \{(S, y) \mid y \in v(S)\}$.

That is, when x is offered, coalition S can form and adopt a payoff y G v{S). 
No further modifications are thereafter possible. 



^ It is the disparate solution concepts for these three types of games that often involve
behavioral and institutional assumptions that should be part of the description of the social
environments.
(3) A coalition first forms and then decides on the payoff distribution. Moreover,
once a coalition forms its members will never again approach nonmembers.
However, a subset of the members of the coalition can further deviate,
and then decide on its payoff. Schematically,

$x \in v(N) \to (S, v_S);\ y \in v(S) \to (T, v_T);\ z \in v(T) \to \cdots$

That is, when $x$ is offered, coalition $S$ can form. Once $S$ forms, its members
decide upon the payoff vector $y \in v(S)$ that it intends to adopt. At this time,
another coalition, $T$, which is a subset of $S$, can in turn form, and then decide
on the payoff vector $z \in v(T)$ that it intends to adopt, and so on.

(4) A coalition forms and decides on the payoff distribution at the same 
time. Moreover, once a coalition forms its members will never again approach 
nonmembers. However, a subset of the members of the coalition can further 
deviate in the same manner. Schematically, 



$x \in v(N) \to \{(S, y) \mid y \in v(S)\} \to \{(T, z) \mid z \in v(T)\} \to \cdots$

That is, when $x$ is offered, coalition $S$ can form and adopt $y \in v(S)$. Another
coalition $T \subseteq S$ can, in turn, form and adopt $z \in v(T)$, and so on.

The distinctive feature of all the above negotiation processes can be viewed
as follows: once a coalition $S$ objects to a proposed outcome $x \in v(N)$, its
members “leave the room” and will never negotiate again with players in
$N \setminus S$. Although the above negotiation processes are different, the “stable
standard of behavior” for each of them leads to an important solution
concept in cooperative games - the core. This is also true for the following
procedure:

(5) A coalition forms and proposes a payoff for the entire society. Moreover, 
once a coalition forms the game is over. Schematically, 

$x \in v(N) \to \{(N, y) \mid y \in v(N),\ y^S \in v(S)\}$.

That is, when x is offered, S can propose a payoff vector y that is feasible 
both for the grand coalition and for S itself. 
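
For readers who want to test core membership directly, the following is a minimal sketch under a simplifying assumption that is not made in the text: utility is transferable, so that each coalition is described by a single number (its worth) rather than by a set of payoff vectors. The function name and the example game are hypothetical.

```python
from itertools import combinations

def in_core(players, worth, x):
    """Core membership in a transferable-utility game (an assumed special case).

    players: tuple of player labels
    worth:   function mapping a frozenset coalition to a number
    x:       dict mapping each player to a payoff
    """
    # the payoff vector must exactly distribute the worth of the grand coalition
    if sum(x[i] for i in players) != worth(frozenset(players)):
        return False
    # no proper coalition may be able to secure more for its members on its own
    for size in range(1, len(players)):
        for S in combinations(players, size):
            if sum(x[i] for i in S) < worth(frozenset(S)):
                return False
    return True

def worth(S):
    """Hypothetical symmetric three-player game."""
    return {1: 0, 2: 3, 3: 6}[len(S)]

equal_split = {1: 2, 2: 2, 3: 2}
lopsided = {1: 6, 2: 0, 3: 0}
```

Here `in_core((1, 2, 3), worth, equal_split)` holds, while `lopsided` fails because players 2 and 3 can jointly secure 3 but receive 0.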

There are, however, many other negotiation processes that can be associ- 
ated with a cooperative game. In particular, individuals can engage in “open 
negotiations”, i.e., every offer or counter offer has to include the members 
of the entire society. And, no coalition is excluded from making counter 
proposals to the one which is currently offered. In particular, members of 
N \ S remain active throughout the negotiation process. The following two 
negotiation processes belong to this category: 
(6) Open negotiation among all players: When a coalition $S \subseteq N$ offers a
payoff (to the entire society), any coalition T C N (not necessarily a subset 
of S) can propose another payoff (again, to the entire society). Moreover, 
every offer and counter-offer has to be feasible for the proposing coalition as 
well as for the entire society. Schematically, 

$x \in v(N) \to \{(N, y) \mid y \in v(N),\ y^S \in v(S)\}$

$\to \{(T, z) \mid z \in v(N),\ z^T \in v(T)\} \to \cdots$

The stable standard of behavior for this negotiation process yields the von 
Neumann and Morgenstern solution. 

(7) Updating “reservation prices” according to the last “tender offer”: Assume
that a payoff $x$ is offered. A coalition $S$ can object to $x$ if there is an
$S$-feasible payoff $y^S \in v(S)$ which makes each member strictly better off, that
is, $y^S \gg x^S$. The new modified offer then becomes $y = (y^S, x^{N \setminus S})$. Now,
another coalition, $T$, may object to $y$ if there is a payoff $z^T \in v(T)$ such that
$z^T \gg y^T$. The resulting new modified offer is then $z = (z^T, y^{N \setminus T})$, and the
bargaining process continues in this manner. Schematically,

$x \in v(N) \to \{(N, y) \mid y^S \in v(S),\ y^{N \setminus S} = x^{N \setminus S}\}$

$\to \{(T, z) \mid z^T \in v(T),\ z^{N \setminus T} = y^{N \setminus T}\} \to \cdots$

Observe that in contrast to the negotiation process (6) where each tender
offer has to be feasible for the entire society (i.e., belong to $v(N)$), the
bargaining procedure described here is such that a coalition $S$ does not have to
offer a payoff that is feasible for the entire society, but only a payoff that is
$S$-feasible. The stable standard of behavior for this negotiation process leads
to the stable bargaining set (see Greenberg 1990 and Greenberg 1992a).

1.2 Normal Form Games 

As is the case with cooperative games, the description of a social environ- 
ment as a normal form game is also inadequate. Again, different negotiation 
processes can be associated with a normal form game. Recall that a normal
form game is a triple $G = (N, \{Z^i\}_{i \in N}, \{u^i\}_{i \in N})$, where $N$ is the set of players,
$Z^i$ is a nonempty set of strategies of player $i$, and $u^i$ is player $i$’s payoff
function, $u^i : Z^N \to \mathbb{R}$. For $S \subseteq N$, let $Z^S$ denote the Cartesian product of
$Z^i$ over $i \in S$, i.e., $Z^S = \prod_{i \in S} Z^i$.

In order to know how the players can use their strategy sets, we must answer 
the question: given $x \in Z^N$, what can a coalition $S \subseteq N$ do? Clearly, the
answer to this question depends on the negotiation process that is applied. 
The following is a sample of the many different scenarios in which the game 
can be played. 
(1) When a coalition forms it assumes that nonmembers will adhere to the
proposed recommendation. Moreover, once a coalition forms the game is over 
(at least for its members). Schematically, 

That is, when $x \in Z^N$ is offered, members of a coalition $S$ may decide
to deviate from $x^S \in Z^S$, assuming that members of $N \setminus S$ will continue
to follow $x^{N \setminus S}$. Moreover, once $S$ is formed, it will never dissolve, that is,
forming a coalition is a binding agreement. The stable standard of behavior 
for this negotiation process yields the strong Nash equilibrium. 

(2) When a coalition forms it assumes that nonmembers will adhere to the 
proposed recommendation. Moreover, once a coalition forms its members 
will never again approach nonmembers. However, a subset of the members 
of the coalition can further deviate. Schematically, 

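$x \in Z^N \to \{(S; (x^{N \setminus S}, y^S)) \mid y^S \in Z^S\} \to$

$\{(T; (x^{N \setminus S}, y^{S \setminus T}, z^T)) \mid z^T \in Z^T\} \to \cdots$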

That is, when $x \in Z^N$ is offered, members of a coalition $S$ may decide to
deviate from $x^S \in Z^S$, assuming that members of $N \setminus S$ will continue to
follow $x^{N \setminus S}$. However, forming a coalition is not binding; after $S$ deviates, a
coalition T, which is a subset of S can then decide to further deviate. The 
stable standard of behavior for this negotiation process yields the coalition 
proof Nash equilibrium. 

(3) Forming a coalition is transitory, and is used to openly negotiate with the 
entire society. Every coalition can counter a proposal. Schematically, 

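$x \in Z^N \to \{(S; (x^{N \setminus S}, y^S)) \mid y^S \in Z^S\} \to$

$\{(T; (x^{N \setminus (S \cup T)}, y^{S \setminus T}, z^T)) \mid z^T \in Z^T\} \to \cdots$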

Thus, in contrast to (2) where only subsets of a deviating coalition can further 
deviate, this negotiation process does not restrict further deviations to subsets 
of the deviating coalition. 

(4) Forming a coalition allows its members to (irrevocably) commit to a 
(correlated) choice of strategies. Schematically, 

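$x \in Z^N \to \{(N \setminus S; (x^{N \setminus S}, y^S)) \mid y^S \in Z^S\} \to$

$\{(N \setminus (S \cup T); (x^{N \setminus (S \cup T)}, y^S, z^T)) \mid z^T \in Z^T\} \to \cdots$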

That is, when $x \in Z^N$ is under consideration, a coalition $S \subseteq N$ can object
to $x$ and choose an $S$-tuple of strategies $y^S \in Z^S$, to which it henceforth
commits. Another coalition $T$, a subset of $N \setminus S$, can in turn further commit
to some $z^T \in Z^T$, and so on.
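
The two solution concepts named in (1) and (2) can be computed by brute force in small finite games. The sketch below uses the standard textbook definitions of strong Nash equilibrium and of coalition proof Nash equilibrium rather than the stable standards of behavior themselves; the function names and the prisoner's dilemma payoffs are hypothetical.

```python
from itertools import combinations, product

def deviations(actions, coalition, base):
    """Profiles reachable when `coalition` re-chooses and the others follow `base`."""
    out = []
    for choice in product(*(actions[i] for i in coalition)):
        x = list(base)
        for i, a in zip(coalition, choice):
            x[i] = a
        out.append(tuple(x))
    return out

def all_gain(payoff, y, x, coalition):
    return all(payoff(y)[i] > payoff(x)[i] for i in coalition)

def strong_nash(actions, payoff):
    """Profiles from which no coalition can deviate to the strict gain of all members."""
    players = range(len(actions))
    coalitions = [S for k in players for S in combinations(players, k + 1)]
    return [x for x in product(*actions)
            if not any(all_gain(payoff, y, x, S)
                       for S in coalitions
                       for y in deviations(actions, S, x))]

def coalition_proof(actions, payoff, coalition, base):
    """Coalition proof equilibria of the game among `coalition`, others fixed at `base`."""
    candidates = deviations(actions, coalition, base)
    if len(coalition) == 1:
        i = coalition[0]
        best = max(payoff(x)[i] for x in candidates)
        return [x for x in candidates if payoff(x)[i] == best]
    # self-enforcing: every proper subcoalition plays coalition proof, given the rest
    self_enforcing = [
        x for x in candidates
        if all(x in coalition_proof(actions, payoff, S, x)
               for k in range(1, len(coalition))
               for S in combinations(coalition, k))
    ]
    # keep the self-enforcing profiles not strictly dominated by another one
    return [x for x in self_enforcing
            if not any(all_gain(payoff, y, x, coalition) for y in self_enforcing)]

# hypothetical prisoner's dilemma payoffs
PD = {("C", "C"): (3, 3), ("C", "D"): (0, 4), ("D", "C"): (4, 0), ("D", "D"): (1, 1)}
ACTIONS = [("C", "D"), ("C", "D")]
```

In this prisoner's dilemma `strong_nash(ACTIONS, PD.__getitem__)` is empty, whereas `coalition_proof(ACTIONS, PD.__getitem__, (0, 1), ("C", "C"))` returns the mutual-defection profile, since only subsets of the deviating pair may deviate further.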







1.3 Extensive Form Games 

The representation of a social environment as an extensive form game is also 
not satisfactory. On one hand, it requires the rigid and exact specification of 
the precise order of moves. On the other hand, each node belongs to a single 
individual, thereby considerably limiting the analysis of coalition formation 
within this framework. Moreover, despite the excessive rigidity concerning 
the order of moves, an extensive form game reveals little about the precise 
negotiation process [for example, who can (effectively) recommend a path 
or a strategy profile, and how can players communicate?]. As a result, a 
game in extensive form can be associated with different social situations that 
represent different negotiation processes. ^ 

Current analysis of extensive form games employs the notions of strategy 
and Nash equilibrium. A strategy is a complete plan that specifies an action 
in every eventuality (even in those that might not arise). Hence it is complex 
and artificial. Moreover, an equilibrium strategy profile forces commonality 
of beliefs on the part of the players. That is, all players have exactly the 
same beliefs concerning other players’ choices of actions, including those “off
the equilibrium path”.

One advantage of TOSS is that it allows one to analyze extensive form games
using the more natural notion of paths which specify only the relevant actions 
of the players. (See Greenberg, Monderer, and Shitovitz 1993, and Greenberg 
1994a,b.) Consider, for example, the environment in which before playing the 
game, individuals discuss/suggest how to play the game, that is, which path is
to be followed. The negotiation process depends, then, on the answer to the
question: who can, and at what juncture, effectively recommend a path? The
following are three (of many) ways in which this question can be answered. 
(1) Upon reaching his ^ decision node, player i can move, induce a subtree, 
and then he, and he alone, can recommend a course of actions (a path) to be 
followed thereafter. Consider, for example, the game in Figure 1.3.1. 

At the root of the tree, player 1 can either induce the subtree and 
suggest one of the two paths in that subtree (for example a) to be followed, 
or induce the subtree and suggest one of the paths in this subtree (for 



^ Another deficiency of extensive form games stems from the “tree structure”: there is a
unique path from the root of the game tree to a node. It is, therefore, impossible to use
such games to analyze situations in which players have “human rationality” and thus often
ignore (perhaps even relevant) information.

^ Gender-neutrality of all the masculine nouns/pronouns/adjectives is, of course, assumed
throughout.
Figure 1.3.1



example, bdf). If player 1 induces T^, then it is player 2’s turn to move. 
Player 2 is under no obligation to follow 1’s recommendation. He is free to
induce any of the terminal nodes in that subtree, including $z$, even if player
1 suggested the path a for T^. If player 1 induces then player 3 is the 
next to move. Player 3 can induce any subtree of and then recommend a 
path in the induced subtree. For example, he can induce and recommend 
the path ce. This process continues till a terminal node is reached. The 
stable standard of behavior for this negotiation process yields a refinement of
subgame perfect equilibria (and in some cases it refines all other refinements). 

(2) A different negotiation process involves only one single discussion concern- 
ing the course of actions to be taken. No “re-negotiation” occurs if players 
deviated from the path which they agreed (but did not commit) to follow. 
Consider, for example, the game in Figure 1.3.2. 





Figure 1.3.2 
At the beginning of the game players consider following the path ab. As
in the previous negotiation process, player 1 can induce either the subtree 
or the subtree T^. He can, however, make no new recommendations 
concerning the paths to be followed. Thus, if he induces it is common 
knowledge that the path a is the one that is supposed to be followed. If, 
on the other hand, he induces T^, no such path exists. This negotiation 
process is particularly appealing in view of the fact that most agreements are 
incomplete; they fail to specify what might happen in every contingency that 
might arise should deviations take place. Moreover, players often know what 
is the existing social norm, or have information concerning how the game has 
been played in the past (by other players). This knowledge serves as a focal
point or the status quo. The stable standard of behavior for this negotiation
process yields the new solution concept, that of “stable paths” (Greenberg 
1994a). A path is stable if it would be followed by rational players were it 
recommended at the beginning of the game. 




Figure 1.3.3 
(3) TOSS easily accommodates coalition formation. For example, we can
modify (1) and (2) to allow not only single individuals, but also coalitions, to 
jointly deviate from a proposed path. To illustrate how this is accomplished, 
consider the game in Figure 1.3.3 and the negotiation process in (2) above 
except that now coalitions can form. Suppose that path abe is recommended 
at the root of the game. When u is reached, players 2 and 3 can form a coali- 
tion and coordinate their actions. In particular, player 2 can choose action c 
and player 3 can choose d. Since player 4 is not part of this coalition, (players 
2 and 3 expect that) he will continue to follow the original recommendation, 
namely, choose action e. Thus, coalition {2,3} can, in this case, induce the 
terminal node z. If forming a coalition is not binding, then player 3, upon 
reaching v, might wish to “double-cross” and induce the terminal node z' 
instead. 

2. The Theory of Social Situations (TOSS) 

I now turn to the formal definitions of the two main concepts in TOSS: 
the notion of a “situation” that provides a complete description of the so- 
cial environment, and the notion of “stability” of a “standard of behavior” 
that provides a unified criterion for recommendations that are likely to be 
acceptable to rational, free individuals. 

2.1 Situations 

The concept of situation involves two elements: a position that describes the
“current state of affairs” , and an inducement correspondence that specifies the 
alternatives available to a player if he decides to reject a proposed outcome 
in the present position. Thus, at a given stage, a position specifies the set 
of individuals, the set of all possible outcomes, and the preferences of the 
individuals over this set of outcomes. Formally, a “position” is defined as 
follows: 

Definition 2.1.1. A position, $G$, is a triple $G = (N_G, X_G, \{u^i_G\}_{i \in N_G})$,
where $N_G$ is the set of players, $X_G$ is the set of all feasible outcomes, and
$u^i_G$ is the utility function of player $i$ in position $G$ over the outcomes, that
is, $u^i_G : X_G \to \mathbb{R}$. Thus, for all $x, y \in X_G$, and for all $i \in N_G$, $u^i_G(x) > u^i_G(y)$
if and only if $i$ prefers, in position $G$, the outcome $x$ over the outcome $y$.

The set of outcomes describes the feasibility constraints at the particular 
stage, and not the choices that are likely to, or should, be made. The only 
requirement is that the domain of the utility functions of the players be the 
set of outcomes; players in position $G$ are able to evaluate every outcome in
$X_G$.

Now, consider a position $G$ and suppose that an outcome $x^* \in X_G$ is
proposed. For a player $i \in N_G$ to be able to decide whether to accept or reject
$x^*$, he must know the alternatives available to him. Such a specification is
provided in TOSS by means of the set of positions, denoted by $\gamma(\{i\} \mid G, x^*)$,
that a player $i$ can, then, “induce”. ^ Moreover, he must also be able to
anticipate what might happen once such a position $H \in \gamma(\{i\} \mid G, x^*)$ is
induced — in particular, what positions can, in turn, be induced from (the
induced) position $H$. This reasoning is applicable to individuals as well as
coalitions. Denote, therefore, by $\gamma(S \mid G, x^*)$ the set of positions a coalition
$S \subseteq N_G$ can induce when an outcome $x^*$ is proposed in position $G$.

A complete description of a social environment is, therefore, given by the 
following concept of a “situation” . 

Definition 2.1.2. A situation is a pair $(\gamma, \Gamma)$, where $\Gamma$ is a set of positions,
and the mapping $\gamma$, called the inducement correspondence, satisfies
the condition that for all $G \in \Gamma$, $S \subset N_G$, and $x \in X_G$, $\gamma(S \mid G, x) \subset \Gamma$.

The requirement that $\Gamma$ be closed under $\gamma$ guarantees a complete specification
of what might happen from any admissible position $G \in \Gamma$, when a
feasible outcome $x \in X_G$ is proposed. Note that $\gamma$ may assign the empty set
to some (or even all) coalitions, that is, some coalitions may be unable to 
induce any position at all. ^ 
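As a rough illustration of Definitions 2.1.1 and 2.1.2, a finite situation can be stored in plain dictionaries. The sketch below is only one possible encoding: the names `positions`, `gamma`, `induced`, and `is_situation`, as well as the toy data, are assumptions made for this example and do not come from the text.

```python
# Hypothetical encoding; every name below is illustrative, not from the text.
# positions[name] = (players N_G, outcomes X_G, utilities u[i][x])
positions = {
    "G": ({1, 2}, {"x", "y"}, {1: {"x": 0, "y": 1}, 2: {"x": 1, "y": 0}}),
    "H": ({1}, {"z"}, {1: {"z": 2}}),
}

# gamma maps (coalition S, position G, proposed outcome x) to the set of
# positions S can induce; a missing key stands for the empty set.
gamma = {(frozenset({1}), "G", "x"): {"H"}}


def induced(gamma, S, G, x):
    """Positions that coalition S can induce when x is proposed in G."""
    return gamma.get((frozenset(S), G, x), set())


def is_situation(gamma, positions):
    """Check that gamma is well defined and that positions is closed under it."""
    for (S, G, x), targets in gamma.items():
        if G not in positions:
            return False
        players, outcomes, _ = positions[G]
        if not (S <= players and x in outcomes):
            return False
        if not targets <= set(positions):  # closure requirement
            return False
    return True


print(is_situation(gamma, positions))
print(induced(gamma, {2}, "G", "x"))  # a coalition that can induce nothing
```

Storing only the nonempty values of the correspondence mirrors the remark that some coalitions may be unable to induce any position at all.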

2.2 Standards of Behavior 

Consider a situation $(\gamma, \Gamma)$, a position $G \in \Gamma$, an outcome $x^* \in X_G$, and a
position $H \in \gamma(S \mid G, x^*)$. For members of $S$ to be able to decide whether
to reject $x^*$ and induce $H$, it is not sufficient for the players to know the set
$X_H$ of potentially feasible outcomes; they must also know the outcomes that
are expected to result once position $H$ is induced. Such a set will be called
a “solution for $H$”, which can most generally be defined as a subset of $X_H$,
denoted by $\sigma(H)$. That is, the acceptability of an outcome $x^*$ in $X_G$ depends
on the outcomes that are (predicted to be) accepted in the positions that can
be induced from $G$ when $x^*$ is proposed. It is for this important reason that
the concept of a standard of behavior is introduced. Given the description
of the social environment as a situation, a standard of behavior specifies a
solution to every position in $\Gamma$.

Definition 2.2.1. Let $\Gamma$ be a set of positions. A mapping $\sigma$ that assigns
to each position $G \in \Gamma$ a solution, $\sigma(G) \subset X_G$, is called a standard of
behavior (SB) for $\Gamma$.



^ Observe that, in contrast to classical game theory, the actions available to a player are not
explicitly modeled. What is important, and modeled by the inducement correspondence,
is what impact such actions can have.

® It is convenient to impose the mild restriction that the set of players in each position
that a coalition $S$ can induce includes, but need not coincide with, the players in $S$, that
is, for all $G \in \Gamma$, $S \subset N_G$, and $x \in X_G$, if $H \in \gamma(S \mid G, x)$, then $S \subset N_H$.







An SB, $\sigma$, for $\Gamma$ is any arbitrary mapping. However, rational players cannot
be expected to follow a “senseless” standard of behavior. Therefore, for $\sigma$ to
be adopted, some restrictions on $\sigma$ seem to be necessary. These restrictions
are manifested by the requirement that the standard of behavior be “stable”. 

2.3 Stability 

Consider a situation $(\gamma, \Gamma)$, and let $\sigma$ be an SB for $\Gamma$. If all players adopt
$\sigma$, it seems reasonable to stipulate that a group of players, $S$, will reject an
outcome, $x \in X_G$, if it can induce a position $H \in \gamma(S \mid G, x)$, whose solution,
$\sigma(H)$, benefits all members of $S$. It is important to note that the rejection
of $x$ must depend only on those outcomes that belong to $\sigma(H)$, not on the
entire set of feasible outcomes, $X_H$. Therefore, we shall say that the SB, $\sigma$,
is internally stable for $(\gamma, \Gamma)$ if for all $G \in \Gamma$, $x \in \sigma(G)$ implies that there
exist no coalition $S \subset N_G$ and position $H \in \gamma(S \mid G, x)$, such that $S$ benefits
by rejecting $x$ and inducing $H$, realizing that the solution to $H$ is given by
$\sigma(H)$. That is, if $\sigma$ is internally stable, then accepting $\sigma$ as the SB implies
the willingness of all players to follow its recommendations in every position.

Internal stability requires the consistency of outcomes recommended by $\sigma$.
TOSS requires that, in addition, the decision to exclude certain outcomes
from $\sigma$ not be arbitrary. (Note that by never recommending any outcome,
no inner contradictions might arise. That is, a standard of behavior $\sigma$ such
that $\sigma(G) = \emptyset$ for all $G \in \Gamma$ is always internally stable.) Thus, for every
position $G \in \Gamma$, the SB $\sigma$ must account not only for elements in $\sigma(G)$ but also
for those in $X_G \setminus \sigma(G)$. TOSS maintains that the only reason for excluding
an outcome $x \in X_G$ is that, were it included in $\sigma(G)$, it would have been
rejected by players who adopt the SB $\sigma$. Formally, TOSS insists that the SB
$\sigma$ be externally stable: for all $G \in \Gamma$, $x \in X_G \setminus \sigma(G)$ implies that there exist
a coalition $S \subset N_G$ and a position $H \in \gamma(S \mid G, x)$, such that $S$ benefits by
rejecting $x$ and inducing $H$, realizing that the solution to $H$ is given by $\sigma(H)$.

The overall single consistency requirement imposed on an SB is that it be 
stable — both internally and externally. Thus, if an SB is stable, then the 
solution it assigns to each position contains those and only those outcomes 
that are not rejected by any coalition, whose members are aware of and 
believe in the specification of the SB. 

While intuitively appealing, if not compelling, there is, however, a technical 
difficulty in formally defining a stable SB. The expression “members of S 
prefer the set $\sigma(H)$ over the outcome $x \in X_G$” involves comparisons of a
single outcome with a set of outcomes. This is a general difficulty which 
confronts any analysis of social environments in which players do not have a 
(subjective or an objective) probability distribution over the set of outcomes. 
It is important to note that this issue concerns players’ preferences and thus 
ought to be part of the description of the social environment. 
Most of the analysis that employs TOSS has, to date, considered one of two
extreme forms of players’ preferences: optimistic and conservative. ^ The
assumption of optimistic behavior entails that members of $S$ prefer $\sigma(H)$ over
$x \in X_G$ if there exists an outcome $y \in \sigma(H)$ that all members of $S$ prefer
to $x$. The other extreme assumption is that players behave “conservatively”,
and $S$ will reject a proposed outcome $x$ in position $G$, and induce the position
$H \in \gamma(S \mid G, x)$, only if all outcomes in $\sigma(H)$ make all members of $S$ better
off. Formally, this is expressed by the following definitions. Note that it is the
individuals that are optimistic or conservative, not the concept of stability.

Definition 2.3.1. Let $\sigma$ be an SB for the situation $(\gamma, \Gamma)$. We shall say that
$\sigma$ is

(i) optimistic internally stable for $(\gamma, \Gamma)$ if for all $G \in \Gamma$, $x \in \sigma(G)$
implies that there do not exist a coalition $S \subset N_G$, a position
$H \in \gamma(S \mid G, x)$, and an outcome $y \in \sigma(H)$ such that for all $i \in S$,
$u^i_H(y) > u^i_G(x)$.

(ii) optimistic externally stable if for all $G \in \Gamma$, $x \in X_G \setminus \sigma(G)$ implies
that there exist $S \subset N_G$, $H \in \gamma(S \mid G, x)$, and $y \in \sigma(H)$ such that
$u^i_H(y) > u^i_G(x)$ for all $i \in S$.

(iii) an optimistic stable standard of behavior (OSSB) if it is both
optimistic internally and externally stable.

Definition 2.3.2. Let $\sigma$ be an SB for the situation $(\gamma, \Gamma)$. The SB $\sigma$ is:

(i) conservative internally stable for $(\gamma, \Gamma)$ if for all $G \in \Gamma$, $x \in \sigma(G)$
implies that there exist no $S \subset N_G$ and $H \in \gamma(S \mid G, x)$ such that
$\sigma(H) \neq \emptyset$, and for all $y \in \sigma(H)$, $u^i_H(y) > u^i_G(x)$ for all $i \in S$.

(ii) conservative externally stable if for all $G \in \Gamma$, $x \in X_G \setminus \sigma(G)$ implies
that there exist $S \subset N_G$ and $H \in \gamma(S \mid G, x)$ such that $\sigma(H) \neq \emptyset$, and
for all $y \in \sigma(H)$, $u^i_H(y) > u^i_G(x)$ for all $i \in S$.

(iii) a conservative stable standard of behavior (CSSB) if it is both
conservative internally and externally stable.
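For a finite situation, the two definitions can be checked mechanically. The following is a minimal sketch under stated assumptions: the dictionary encoding, the function names `prefers` and `is_stable`, and the two-position example are all hypothetical and serve only to contrast optimistic and conservative behavior.

```python
# Hypothetical encoding of a finite situation (all names are illustrative).
# positions[name] = {"outcomes": labels, "u": {player: {outcome: utility}}}
positions = {
    "G": {"outcomes": {"a", "b"}, "u": {1: {"a": 1, "b": 0}, 2: {"a": 0, "b": 3}}},
    "H": {"outcomes": {"c", "d"}, "u": {1: {"c": 2, "d": -1}}},
}
# gamma[(S, G, x)] = positions coalition S can induce when x is proposed in G.
gamma = {(frozenset({1}), "G", "b"): {"H"}}


def prefers(S, u_H, sol_H, u_G, x, optimistic):
    """Does coalition S prefer the solution sol_H (a set) to the outcome x?"""
    if not sol_H:
        return False  # an empty solution never justifies a rejection
    gains = [all(u_H[i][y] > u_G[i][x] for i in S) for y in sol_H]
    return any(gains) if optimistic else all(gains)


def is_stable(sigma, optimistic):
    """Check internal and external stability of the SB sigma in one pass."""
    for G, pos in positions.items():
        for x in pos["outcomes"]:
            rejected = any(
                prefers(S, positions[H]["u"], sigma[H], pos["u"], x, optimistic)
                for (S, G0, x0), targets in gamma.items()
                if (G0, x0) == (G, x)
                for H in targets
            )
            # Internal stability: x in sigma(G) must not be rejected.
            # External stability: x outside sigma(G) must be rejected.
            if (x in sigma[G]) == rejected:
                return False
    return True


ossb = {"G": {"a"}, "H": {"c", "d"}}       # candidate OSSB
cssb = {"G": {"a", "b"}, "H": {"c", "d"}}  # candidate CSSB
print(is_stable(ossb, optimistic=True), is_stable(cssb, optimistic=False))
```

In this toy example player 1 can leave outcome b for position H, where one outcome is better for him and one is worse. An optimistic player 1 rejects b, a conservative one does not, so the two candidate standards differ only at b.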

Remark 2.3.3. The terms “standard of behavior” and “stability” have been 
borrowed from von Neumann and Morgenstern’s (1947) seminal work. How- 
ever, the formalism, motivation, and rationale of these two approaches are 
quite different. A situation provides the relevant details about the negotia- 
tion process. It turns out, as was pointed out by Shitovitz (Greenberg, 1990, 
Theorem 4.5), that the OSSB can be formally derived from a von Neumann
and Morgenstern (vN&M) abstract stable set of an “associated abstract system”.
This result should not be misinterpreted to imply that the OSSB, or
more generally TOSS, can be identified with vN&M’s notion, for the
following two reasons:

^ Though these assumptions yield, as we shall see, many interesting results, the investi- 
gation of alternative (and perhaps, more plausible) behavioral assumptions may yet prove 
to be much more valuable. 
(i) The standard of behavior that results from any behavioral assumptions
other than optimism cannot be formally derived from a vN&M abstract
stable set. (For example, CSSBs may be nested, in contrast to the fact
that OSSBs or vN&M abstract stable sets can never be nested.)

(ii) Even if individuals are optimistic, different situations (with different 
negotiation processes, beliefs, and institutions) might yield the same 
abstract system. That is, the solutions (captured by the OSSB or the 
vN&M abstract stable set) for these situations coincide, but not the 
underlying data. 

2.4 A Few General Results 

I shall now mention some general properties of OSSB and CSSB. ® First,
there exist situations that: (i) admit both an OSSB and a CSSB, (ii) admit
neither an OSSB nor a CSSB, (iii) admit either an OSSB or a CSSB but
not both. Second, in many situations the OSSB and the CSSB coincide. 
These situations include all those that represent a social environment where 
the negotiation process is such that at any stage a specific outcome (“bill / 
proposal”) is being considered. This “status quo” can, in turn, be replaced 
(“amended / countered”) by another outcome. Third, a large class of situations,
called strictly hierarchical situations, admits a unique OSSB as well as a
unique CSSB, for which explicit formulae can be provided. Loosely speaking, 
strictly hierarchical situations are those situations that can be represented 
by a finite acyclic directed graph. Many situations that represent games or 
economic models of particular interest are strictly hierarchical. 

Observe that, by external stability, if $\sigma$ is either an OSSB or a CSSB for a
situation $(\gamma, \Gamma)$, then there exists at least one position $G \in \Gamma$ with $\sigma(G) \neq \emptyset$.
We shall say that the SB $\sigma$ is nonempty-valued if $\sigma(G) \neq \emptyset$ for all $G \in \Gamma$.
If $\sigma$ and $\tau$ are two SBs for a situation $(\gamma, \Gamma)$, we say that $\sigma$ includes $\tau$ if
$\tau(G) \subset \sigma(G)$ for all $G \in \Gamma$. For CSSB, Greenberg, Monderer, and Shitovitz
(1993) proved the following general result.

Theorem 2.4.1. Let $(\gamma, \Gamma)$ be a situation and let $\Sigma$ be the set of all
conservative internally stable nonempty-valued SBs. If $\Sigma$ is nonempty, then it
admits a largest element, with respect to the inclusion order. Moreover,
this largest element is the largest nonempty-valued CSSB.

Corollary 2.4.2. If $\sigma$ is a nonempty-valued OSSB, then there exists a CSSB
that includes it. 

As was discussed in Remark 2.3.3, an OSSB can be formally derived from 
a vN&M abstract stable set of an “associated abstract system”. This result 
enables us to apply results from works on abstract stable sets to TOSS. Using 
one of these results, Shitovitz proved the following theorem. 

® As stated in the introduction, for precise statements and proofs of results quoted in this 
lecture, see Greenberg (1990a), unless otherwise specified.
Theorem 2.4.3. Let $(\gamma, \Gamma)$ be a situation such that $\Gamma$ contains a finite
number of positions, and each position $G \in \Gamma$ contains a finite number of
outcomes. Assume, in addition, that the inducement correspondence is such
that for all $G, H \in \Gamma$, $x \in X_G$ and $S \subset N_G$, if $H \in \gamma(S \mid G, x)$ then $N_H = S$.
Then, $(\gamma, \Gamma)$ admits a unique OSSB.

Another implication of the formal relationship between OSSBs and abstract
stable sets is that, in contrast to CSSBs, it is impossible for one OSSB to
include another.

3. TOSS and Cooperation 

3.1 Cooperative Games 

As was mentioned in subsection 1.1, the (optimistic or conservative) stable 
standards of behavior for situations that represent a cooperative game yield 
some of the better-known solution concepts (such as the core and the vN&M 
solution) as well as offer new concepts (such as the stable bargaining set). 
In addition to “bridging” classical game theory and TOSS, these results also
highlight the underlying negotiation processes, thereby enhancing our under- 
standing of these solution concepts, and perhaps making us more critical of 
them. 

For example, it can be shown that the only candidate for an OSSB for
the situation that represents the negotiation process whereby a coalition $S$
can induce the subgame $(S, v_S)$ [outlined in (3) of subsection 1.1], is the
core mapping. However, not every game admits an OSSB. This result points
out the following deficiency in the definition of the core. The core of a
cooperative game $(N, v)$ contains all those payoff vectors in $v(N)$ that are
not blocked by any coalition $S$, using any payoff vector in $v(S)$, including
a payoff that can, in turn, itself be blocked. But if we are to consider only
payoffs that cannot be blocked (i.e., that belong to the core), then the same
property should be required of the blocking payoffs. Using the terminology
of TOSS, the definition of the core lacks “external stability”: outcomes that
are not recommended (in this case, payoffs that do not belong to the core)
must be those (and only those) that if offered would be rejected given the
recommendation for the “induced positions” (in this case, the core of the
subgames).
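To make the classical blocking test concrete, here is a minimal sketch for the special case of a transferable-utility game, where $v(S)$ is a single number rather than a set of payoff vectors. That simplification, the function names, and the two example games are assumptions of this sketch. Note that the test lets a coalition block with anything it can obtain, whether or not that payoff would itself survive blocking in the subgame, which is exactly the feature criticized above.

```python
from itertools import combinations


def coalitions(players):
    """Yield every nonempty coalition S of the player set."""
    players = sorted(players)
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)


def blocking_coalitions(x, v, players, tol=1e-9):
    """Coalitions S that can block x, i.e. v(S) exceeds what x gives to S."""
    return [S for S in coalitions(players)
            if v(S) > sum(x[i] for i in S) + tol]


def in_core(x, v, players, tol=1e-9):
    """x is in the core if it is feasible for N and no coalition blocks it."""
    grand = frozenset(players)
    feasible = sum(x[i] for i in grand) <= v(grand) + tol
    return feasible and not blocking_coalitions(x, v, players, tol)


players = {1, 2, 3}
equal_split = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
majority = lambda S: 1.0 if len(S) >= 2 else 0.0   # any two players win
unanimity = lambda S: 1.0 if len(S) == 3 else 0.0  # only the grand coalition wins

print(in_core(equal_split, majority, players))
print(in_core(equal_split, unanimity, players))
```

In the majority game every pair blocks the equal split, while in the unanimity game no coalition does.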

The external stability of the core is interesting in its own right. Moreover,
as mentioned above, external stability of the core implies the existence of
an OSSB. The following are two important classes of games in which the core
mapping is externally stable. 

(i) In every game where the set of players is finite, the core mapping is an 
OSSB (see Ray 1989 and Greenberg 1990). 
(ii) In a variety of mixed economies with “large” (atoms) and “small” (atomless)
traders, the core mapping is an OSSB (see Mas-Colell 1989, Greenberg
and Shitovitz 1994, Einy and Shitovitz 1994, and Shitovitz and
Weber 1994).

These results naturally raise the following open question: 

Let $(N, v)$ be a cooperative game (with an infinite number of players) with
the property that each of its subgames admits a nonempty core. Is the 
core correspondence an OSSB for the core situation? That is, is it true 
that every payoff outside the core is blocked by a payoff in the core of some 
subgame? 

The fact that the OSSB for the situation representing the negotiation process
(6) in subsection 1.1 yields the vN&M solution leads to another interesting
open question: in which games does the core coincide with the (unique)
vN&M solution? Here is a partial answer to this question. We shall say that
a game has the “extension property” if for every payoff $x$ in the core of a
subgame $(S, v_S)$ there exists a core payoff $y$ whose restriction to $S$ coincides
with $x$, i.e., $x = y^S$. It is easy to verify that if the extension property holds,
then the core coincides with the (unique) vN&M solution if and only if the
core mapping is optimistic (externally) stable. Convex games form one class
of games that has this property. Another class was recently discovered by
Einy, Holtzman and Shitovitz (1994), who proved the following result: Let
$\mu_1, \mu_2, \ldots, \mu_n$ be non-atomic probability measures on $(T, \Sigma)$ which are
absolutely continuous with respect to $\mu$, where $T = [0, 1]$ is the set of agents, and
$\Sigma$ is the set of Borel subsets of $T$, representing the set of admissible coalitions.
For each $S \in \Sigma$ let $v(S) = \min\{\mu_1(S), \mu_2(S), \ldots, \mu_n(S)\}$. Then, the core of
the game $v$ is the unique vN&M stable set.

Another open question is: Does the stable bargaining set [see negotiation
process (7) in subsection 1.1] contain (only) Pareto optimal payoffs?

In addition to these specific open questions, it is also interesting to inves- 
tigate the (optimistic, conservative, or any other type of) stable standard of 
behavior for different negotiation processes, as well as to find “plausible”
negotiation processes that yield known solution concepts such as the Shapley 
value. 

3.2 Cooperation in “Noncooperative Games”

When TOSS is applied to either normal or extensive form games, the OSSB 
often yields “cooperative outcomes”. The following are a few examples that
demonstrate this phenomenon. 

(1) The OSSB in the situation associated with a game tree often yields ap- 
pealing refinements of subgame perfect equilibria that involve cooperation. 
This is clearly illustrated by the following example of “retrospective voting”.
There is a single voter, player 1, who can vote for either candidate 2 or
candidate 3. The candidate that player 1 elects becomes the incumbent, and
must then choose a policy from the set {a, b}. After the incumbent chooses a
policy, the voter must vote again for the candidate who will be the incumbent 
in the next period. The voter’s preferences are a function only of the policy 
selected, and he gets one unit of utility every time a is chosen, and 0 every 
time b is chosen. The candidates get utility only from being elected, obtaining
one unit of utility whenever they are elected. This game proceeds for a finite
number of periods. An important feature of this model is that candidates 
cannot commit themselves before the election to adopting particular policies. 

Inherent to this game is that the decision made by the voter affects the 
utility of the candidate, but not the utility of the voter, and similarly, when 
the candidate adopts a policy, it affects the utility of the voter, but not the 
utility of the candidate. It is for this reason that, as is easily verified, there 
are many subgame perfect equilibria in this game [as well as many “trembling
hand” (Selten 1975) and “stable” (Kohlberg and Mertens 1986) equilibria]. 
In fact, every path in the game tree is supported by a subgame perfect equi- 
librium. Clearly, the “plausible” equilibrium is where the voter “rewards” 
the candidate who chooses the policy a, and “punishes” the candidate who 
implements policy b, thereby inducing the candidate to select the preferred
policy, in anticipation of being reelected. These equilibria result in paths 
which always give the voter maximum utility, and the initial incumbent always
gets reelected. Quite remarkably, these are precisely the paths assigned
by the unique OSSB for the associated situation. Winer (1989) shows that 
this characterization holds also for the infinite case. 

The appeal of the refinement provided by the OSSB for game trees is also 
demonstrated by Tadelis (1994) who shows that for a large class of games, 
that includes all games with common interest, the “non-discriminating” OSSB
of the infinitely repeated extensive form game yields only Pareto optimal 
outcomes. 

An interesting open question is the existence of an OSSB in “continuous
game trees”. A partial answer to this question was recently provided by 
Shitovitz (1994). 

(2) TOSS is particularly useful when players are allowed to negotiate or dis- 
cuss the course of actions they wish to take. This feature of TOSS has already 
been extensively applied. For example, in normal form games, the OSSB for 
the situation that represents the negotiation process given in (2) of subsec- 
tion 1.2 yields the notion of “coalition proof Nash equilibrium” (CPNE). 
This characterization enables us to extend the definition of CPNE to games 
with an infinite number of players. [The original definition, due to Bernheim, 
Peleg, and Whinston (1987) is recursive.] This extension was recently used 
by Alesina and Rosenthal (1993) who study a model with a continuum of 
voters and use the CPNE to explain the phenomena of “mid-term electoral
cycle”, “split-ticket voting”, and “divided government”. Khan and Mookher- 
jee (1993) use our characterization to study the CPNE in insurance economies 
with incomplete information. 

Among the many open questions in this area is finding general conditions 
for the existence and uniqueness of an OSSB in the “CPNE situation” [described
in (2) of subsection 1.2]. 

Another situation that can be associated with a normal form game is the 
“contingent threat” situation, described in (3) of subsection 1.2. It captures 
“open negotiations” among players who can make “tender threats”. Muto 
and Okada (1992) study the OSSB for this situation in a duopoly market with 
quantity competition. They show that if the two firms can form a coalition
(but cannot sign binding agreements) then outputs supported by the OSSB 
yield higher profits than those of the Cournot-Nash level. Another application 
of the OSSB for the open negotiation process was explored by Muto and 
Nakayama (1991) in the context of “resale-proof” trade of information when 
communication and non-binding agreements between players, as well as resale 
of information, are allowed. Resale-proof trade is obtained as the unique 
stable SB of the situation that describes an open negotiation process. 

(3) TOSS also provides new “cooperative concepts” within the context of 
“noncooperative games”. For example, Asheim (1988, 1990) derived the notion
of renegotiation proofness from the OSSB of the following situation: in
every subgame of the original (infinite) game tree, in addition to deviations 
by single players, the grand coalition can “reconsider” its course of actions. 
That is, the entire set of players, N, can induce (“recommend”) any path in
any subgame (position). 

Another concept that can naturally be defined using TOSS is that of “farsighted
behavior”. The way in which individuals view their alternatives and
the consequences of their actions is captured by the set of alternatives at 
each position and the inducement correspondence. Chwe (1992) and Xue 
(1993) study coalition formation of farsighted individuals in social environ- 
ments with diverse coalitional interactions. 

(4) TOSS also allows the analysis of social environments that cannot be
represented as “games”. For example, there are social environments in which 
players may be forced (because of, e.g., legal, historical, social, or ethical 
considerations) to restrict their actions so that, for example, the resulting 
outcomes be Pareto optimal. That is, only Pareto optimal outcomes can 
be considered, and similarly objections to a proposed outcome cannot be 
based on deviations to non-Pareto outcomes. But the set of Pareto optimal 
strategy profiles is, in general, a strict subset of the set of all a-priori possible 
profiles, and, moreover, it cannot be represented as a Cartesian product of the 
individual strategy sets, and therefore, cannot be represented as a (normal 
form) game. Greenberg, Monderer, and Shitovitz (1993) have analyzed such
environments using the notion of “multistage situations”. Interestingly, they
show that the unique CSSB in the repeated Prisoner’s Dilemma, when only
Pareto optimal paths are “admissible”, consists only of the two extreme
noncooperative paths.

(5) Using TOSS, Greenberg, Monderer, and Shitovitz (1993) defined the notion
of “k-bounded rationality”. This notion captures the fact that players
“look ahead” at most k periods (where k is less than the length of the
game). Thus, players will not deviate from a path agreed upon if and only if
they cannot benefit from a deviation in the next k periods. Of course, when
deviating, players consider only paths that are “k-rational” in the induced
subgame. Greenberg, Monderer, and Shitovitz (1993) show that if players
are “conservative”, “k-boundedly rational”, and their discount factor is close
to 1, then it is possible to have (full) cooperation in the finitely repeated
Prisoner’s Dilemma game.

(6) My last example in this lecture concerns negotiations in extensive form 
games with imperfect information. Consider the “peace-negotiation” example 
in Figure 3.2.1. 



[Figure 3.2.1. Game tree of the “peace-negotiation” example.]

Each of the two warring countries, 1 and 2, has to decide whether or not 
to reach a peace agreement, represented by the path (bd). Failing to reach 
an agreement, country 3 would “re-evaluate” its policy, a decision that
will affect both countries 1 and 2. Assume that country 3 has no way to 
know which of the two countries caused the break-down of the negotiations 
(otherwise, it could threaten to retaliate against that country). All it knows 
is whether or not the negotiations are successful. As the payoffs indicate, it 
is in the best interest of country 3 that the two warring countries sign the 
peace agreement. Both countries 1 and 2 may (correctly) anticipate the set of 
“plausible/rational” re-evaluated policies country 3 can take if no agreement 
is reached. As country 3 cannot know who is responsible for the break-down 
of the peace negotiations, both policies L and R are “rational”. [L is the best
policy for 3 if it is country 1 that refused the treaty, and R if it is country 2.]
Therefore, unless country 3 predetermines (or reveals in advance) the policy
it would adopt should the peace treaty not be reached, countries 1 and 2 have
no way to know (even probabilistically) which policy would be adopted by
country 3. It is, then, conceivable that each country will follow the path (bd),
each for a different reason: country 1 for fearing that policy L is more
likely to be adopted than policy R, and country 2 for fearing that policy R
is more likely to be adopted than policy L. It is important to observe that
if both countries held the same beliefs on the likelihood of the adoption of
policies L and R, at least one of these two countries would find it in its best
interest to jeopardize the peace talks.

The success of the peace mediation between Israel and Egypt (players 1 and 
2) by the U.S. (player 3) following the 1973 war, may be, at least partially, 
attributed to such a phenomenon. Egypt and Israel were each afraid that if 
negotiations broke down, she would be the loser.

“And once a negotiation is thus reduced to details, it has a high probability 
of success - unless one party has consciously decided to make a show of 
flexibility simply to put itself in a better light for a deliberate breakup of the 
talks. Egypt was precluded from such a course by the plight of the Third 
Army, Israel by the fear of diplomatic isolation. The odds favored success, 
even though major differences remained.” (Kissinger, 1982, p. 802.) 

However, no Nash equilibrium for this game supports the path (bd). In fact,
this game possesses a single Nash equilibrium, in which player 1 mixes
between a and b, player 2 uses the pure strategy c, and player 3 mixes
between L and R. Indeed, the notion of Nash equilibrium
in strategies implies “commonality of beliefs”: players have exactly the same
beliefs concerning other players’ actions, including those “off the equilibrium
path” (which are, therefore, unobservable). In contrast, TOSS enables us
to use the notion of paths rather than strategies, thereby allowing for
the possibility that off the “stable path” players may have different beliefs
concerning the actions other players might take. Thus, players agree to follow
a course of actions, each for his own reasons. “Stable paths” (Greenberg
1994a) are those that would be followed if recommended to rational players
(at the beginning of the game). In the “peace negotiation” example above, 
(bd) is a stable path. 

The above analysis and the notion of “stable paths” suggest a new avenue
of research for stable outcomes in situations where there is a “single public
recommendation”. (This recommendation may represent the existing social
norm, past observations, incomplete contracts, etc.) Recall that the definition
of stability entails that if $\sigma$ is a stable standard of behavior, then a (publicly
recommended) outcome $x \in X_G$ does not belong to $\sigma(G)$ if and only if there
exist a coalition $S \subset N_G$ and a position $H \in \gamma(S \mid G, x)$ such that $S$ “prefers
$\sigma(H)$ to $x$” ^. Thus, it is implicitly assumed that once position $H$ is induced,
a new recommendation [from $\sigma(H)$] will be publicly made. While this may
well be the case in many social environments, such a procedure does not 
prevail in general. It is, therefore, interesting to investigate stable outcomes 
in social environments where a public recommendation is made only in one 
(“the initial”) position, and no new recommendations are publicly made in 
subsequent induced positions. 

4. Concluding Remarks 

I argued that none of the three types of games provides a complete description of
social environments. It is for this reason that the disparate solution concepts 
involve (often implicit) assumptions which should be part of the description 
of social environments. More importantly, there are social environments that 
cannot be represented by any of the three types of games. The theory of social
situations amends these deficiencies. In particular, it offers an integrative
approach to the study of social environments with diverse coalitional
interaction. A situation provides a complete and unified description of a social
environment while the solution concept uses stability as the sole criterion. 
I have shown the flexibility and merits of the theory through several
applications. There are many interesting open questions concerning both ap- 
plications and theoretical issues. Some of the possible directions, in addition 
to those mentioned in the previous sections, along which the theory could be 
extended are: analyzing situations with behavioral assumptions other than 
optimistic or conservative; designing more pertinent institutions and negoti- 
ation processes; studying persisting social norms (or incomplete agreements) 
where players do not share the same beliefs “off the equilibrium path” ; and 
exploring new variants of bounded rationality that constrain not only the
computational but also the perceptual abilities of the players. 



^ We ignore the difficulty of formalizing this clause and assume that it is well-defined,
either by optimistic or conservative behavior, or by assigning some probability distribution
over outcomes in $\sigma(H)$.

In the sense that every player in H is aware of this recommendation. 
Introduction



Cooperative game theory deals with feasible outcomes, whereas noncooperative
game theory is concerned with strategic equilibrium. Repeated games provide a 
bridge between these two theories: folk-theorem-like results deal with the rela- 
tion between feasible payoffs in a one shot game and equilibrium payoffs in the 
corresponding repeated game. 

Moreover, in repeated games new strategic objects appear (cooperative plans, 
threats...) that are closely related to cooperation. More precisely, through rep- 
etition all individually rational Pareto efficient allocations can be achieved as 
equilibrium payoffs. 

For instance, the infinitely repeated prisoner’s dilemma has equilibrium
strategies that lead to an equilibrium path along which both players cooperate. 

This lecture is divided into two main sections: the first one focuses 
on the standard signalling case, where players know all past history, and the 
second one deals with general signalling, where players get only private signals 
about the past. All the games we consider have complete information. 

This area is currently very active. We would like to refer to the following surveys 
for further results and points of view: [4], [9], [24], [26], [28], [33], [34]. 

1 Standard signalling 

1.1 The model 

We deal only with finite games. 

$G = (N, (S^i)_{i \in N}, g)$ is a normal form game defined by: 

- $N$, a finite set of players ($N = \{1, \ldots, m\}$). 

- For any $i$ in $N$, $S^i$, a finite set of pure moves. Let $S$ denote $\prod_{i \in N} S^i$. 

- $g$, the payoff function: $g^i(s)$ is $i$’s payoff if the players’ independent moves 
result in the profile $s = (s^1, \ldots, s^m)$. 



^ Notes written by Dinah Rosenberg and revised by the author (December 1994). 

We extend the definition of $g$ to mixed and correlated strategies. If $\sigma$ is a 
probability distribution on $S$, $g(\sigma)$ is defined by: $g(\sigma) = \int_S g(s)\, d\sigma(s)$. 

The repeated game associated to G is played in stages: 

Stage 1: Each player $i$ chooses a move $s_1^i$ in $S^i$; $s_1 = (s_1^1, \ldots, s_1^m)$ is announced. 

Stage n: Let $s_k$ stand for the profile of moves at stage $k$, $(s_k^1, \ldots, s_k^m)$. 
(Subscripts stand for time and superscripts for players). Denote by $h_n = (s_1, \ldots, s_{n-1})$ 
the history available at stage $n$. Knowing $h_n$, each player $i$ chooses 
$s_n^i$ in $S^i$. 

A play is an infinite history; we denote by H the corresponding set and by $H_n$ 
the set of histories available at stage n. 

Given a play $s = (s_1, \ldots, s_n, \ldots)$ we have a well defined sequence of payoffs $g_n = g(s_n)$. 


Payoffs 

We can define different games according to the way we evaluate the stream of 
payoffs: 

a) the finite game $G_n$: the payoff is the arithmetic average of the payoffs of 
the first n stages: $\gamma_n = (g_1 + \ldots + g_n)/n$. 

b) the discounted game $G_\lambda$: the payoff is the geometric average of the 
sequence of payoffs: $\gamma_\lambda = \sum_{n \ge 1} \lambda (1-\lambda)^{n-1} g_n$, for $\lambda \in (0, 1]$. 

Since we are interested in long games, we will consider asymptotic properties: 
$n \to \infty$, $\lambda \to 0$. 

c) the infinitely repeated game $G_\infty$: the payoff is some limit of the sequence 
$\gamma_n$; one may choose liminf, or limsup, or some Banach limit. The choice of the 
limit we consider is arbitrary, so there is no well defined intrinsic payoff. 
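
The two evaluations a) and b) can be compared numerically. The following is a minimal sketch, not part of the original notes: the function names and the alternating payoff stream are illustrative assumptions, and the discounted sum is truncated to the finite stream that is supplied.

```python
def average_payoff(stream, n):
    """Arithmetic mean of the first n stage payoffs (evaluation used in G_n)."""
    return sum(stream[:n]) / n


def discounted_payoff(stream, lam):
    """lam * sum_k (1 - lam)^(k-1) * g_k over the given finite stream.

    For a finite stream this truncates the infinite sum; the neglected
    tail has total weight (1 - lam) ** len(stream).
    """
    return sum(lam * (1 - lam) ** k * g for k, g in enumerate(stream))


# Hypothetical stream for one player: payoffs 3, 0, 3, 0, ... over 1000 stages.
stream = [3, 0] * 500
avg = average_payoff(stream, 1000)
disc = discounted_payoff(stream, 0.01)
print(avg, disc)
```

With a small lambda the discounted evaluation is close to the long-run average, while lambda = 1 keeps only the first stage payoff.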

Strategies 

We define the same set of strategies for the three games in order to compare 
them. In fact we consider that there is only one repeated game form, and that 
$G_n$, $G_\lambda$ and $G_\infty$ are three evaluations of outcomes in that game form. 

A pure strategy for player $i$ is a sequence $(\theta_1^i, \ldots, \theta_n^i, \ldots)$ with $\theta_n^i : H_n \to S^i$. 
A mixed strategy is a probability distribution on pure strategies. 

The games are with perfect recall, so we can use Kuhn’s theorem (and its 
extension to the infinite case by Aumann), which allows us to consider only 
behavioral strategies, where a behavioral strategy for player $i$ is a sequence 
$\sigma^i = (\sigma_1^i, \ldots, \sigma_n^i, \ldots)$, with $\sigma_n^i$ being a function from $H_n$ to $\Delta(S^i)$. ($\Delta(X)$ is the set of 
probability distributions over $X$, for any finite set $X$.) 

Remarks: 

- Since strategies depend on histories, we have used the standard signalling 
hypothesis to define them. 

- Altogether, $G_n$ is a finite game, i.e. for any n the set of pure strategies is 
finite: there is no problem of existence of equilibria. 

$G_\lambda$ has compact sets of pure (hence mixed) strategies for each player, and 
continuous payoff functions: there is no problem of existence of equilibria either. 
In order to avoid the problem of definition of the payoffs in $G_\infty$, one will deal 
with uniform equilibrium in the following sense: 

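Definition: $x \in \mathbb{R}^m$ is a uniform equilibrium payoff if $\forall \varepsilon > 0$, $\exists N$, $\exists \sigma$ such 
that $\forall n \ge N$, $\sigma$ is an $\varepsilon$-equilibrium in $G_n$ with payoff within $\varepsilon$ of $x$. 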

We have to introduce a few more definitions and notations before we state the 
results. 

Definition: $\sigma$ is a subgame perfect equilibrium of the game $\Gamma$ if it is a strategy 
profile such that for every history $h$, $\sigma[h]$ is an equilibrium of the subgame of $\Gamma$ 
starting after $h$, where $\sigma[h]$ is defined for all histories $h'$ by $\sigma[h](h') = \sigma(hh')$, 
and $hh'$ stands for $h$ followed by $h'$. 

Notations: 

- $D$, the set of feasible payoffs, is the convex hull of $g(S)$. 

- $E_n$ (resp. $E_\lambda$, $E_\infty$) is the set of equilibrium payoffs in $G_n$ (resp. $G_\lambda$, $G_\infty$). 

- $E'_n$ (resp. $E'_\lambda$, $E'_\infty$) is the set of subgame perfect equilibrium payoffs in $G_n$ 
(resp. $G_\lambda$, $G_\infty$). 

- The individually rational level for player $i$ is defined by: 
$v^i = \min_{\sigma^{-i}} \max_{\sigma^i} g^i(\sigma^{-i}, \sigma^i)$, 

where the maximum and the minimum are taken respectively over the set of 
mixed strategies of player $i$ and of his opponents. 

A payoff $x$ is individually rational if for all $i$ in $N$, $x^i \ge v^i$. 

$v = (v^1, \ldots, v^m)$ is the threat point. We denote by IR the set of individually 
rational payoffs. 

- F is the set of individually rational payoffs that are in D. 

Example 1: The battle of the sexes. 

The game is represented by the following matrix: 

$\begin{pmatrix} (2,1) & (0,0) \\ (0,0) & (1,2) \end{pmatrix}$ 

The threat point $v$ is (2/3, 2/3). 

$D$ is the set: $\{(x, y) : 2y - x \ge 0,\ y - 2x \le 0,\ x + y - 3 \le 0\}$. 
The following picture shows the point v, the sets D, IR, F. 

[Figure: the threat point v and the sets D, IR and F for the battle of the sexes.] 

The set of payoffs achievable in $G_1$, $D_1$, is also represented, and one can see that 
it is strictly smaller than $D$. 

In fact $D_1 = \{(x, y) : x^2 + y^2 - 2xy - 2x/3 - 2y/3 + 1 \ge 0\} \cap D$. It is easy to 
check that (3/2, 3/2) is not in $D_1$. 
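
The claims of Example 1 can be checked numerically. The sketch below is an illustration only: the function names, the grids and the tolerances are assumptions. It tests membership in D and in the set of payoffs reachable with independent mixed moves, and approximates the threat level of player 1 on a grid.

```python
def in_D(x, y, tol=1e-12):
    """Membership in the triangle D = conv{(0,0), (2,1), (1,2)}."""
    return 2 * y - x >= -tol and y - 2 * x <= tol and x + y - 3 <= tol


def in_D1(x, y, tol=1e-12):
    """Payoffs reachable in the one shot game with independent mixed moves."""
    quad = x * x + y * y - 2 * x * y - 2 * x / 3 - 2 * y / 3 + 1
    return quad >= -tol and in_D(x, y, tol)


def payoff(p, q):
    """Expected payoff when player 1 plays Top w.p. p and player 2 Left w.p. q."""
    a, b = p * q, (1 - p) * (1 - q)  # weights of (Top, Left) and (Bottom, Right)
    return (2 * a + b, a + 2 * b)


grid = [k / 50 for k in range(51)]
all_inside = all(in_D1(*payoff(p, q), tol=1e-9) for p in grid for q in grid)

# Threat level of player 1: player 2 mixes (Left w.p. q), player 1 best-replies.
v1 = min(max(2 * q, 1 - q) for q in [k / 300 for k in range(301)])
print(all_inside, in_D(1.5, 1.5), in_D1(1.5, 1.5), v1)
```

The point (3/2, 3/2) lies in D, since it is the midpoint of (2,1) and (1,2), but it requires correlation between the two players, which is why it fails the quadratic inequality.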



1.2 The Results 

What kind of results are we looking for? A “folk theorem” like result is a 
statement of the kind “$\lim E^b_a = F$” with $a \in \{n, \lambda, \infty\}$, and with $E^b$ standing for 
$E$ or $E'$, expressing the fact that asymptotically, all individually rational and 
feasible payoffs are sustainable by equilibrium strategies. This kind of result is 
quite striking because $F$ depends only on the one shot game. In fact, both the 
set of feasible payoffs, $D$, and the individually rational level are parameters of 
the one shot game. 

The meaning of this result in terms of cooperation is the following: as all points 
in F are achievable as equilibria, this theorem relates a cooperative notion 
(feasibility) to a non cooperative one. The theorem implies in particular that all 
Pareto optimal payoffs in IR are achievable as equilibrium payoffs, hence that 
there are Pareto optimal equilibrium payoffs. 

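The main results are summed up in the following table; each row concerns the 
folk theorem in one case ($G_n$, $G_\lambda$, or $G_\infty$), one column concerns the theorems 
of the form E = F, and the other one the theorems of the form E' = F. If the 
result is true under some condition, it is mentioned in the corresponding cell. 

             | Nash equilibrium        | Subgame perfect equilibrium 
-------------+-------------------------+------------------------------------------ 
$G_\infty$   | folk theorem (~1960)    | Aumann and Shapley [5], Rubinstein [27] 
-------------+-------------------------+------------------------------------------ 
$G_\lambda$  | Sorin [32],             | Fudenberg and Maskin [13], 
             | under condition A       | under condition B 
-------------+-------------------------+------------------------------------------ 
$G_n$        | Benoit and Krishna [7], | Benoit and Krishna [6] (pure strategies), 
             | under condition C       | Gossner [14] (mixed strategies), 
             |                         | under condition D 
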

The conditions are the following: 

Condition A: there is an element $f$ in $F$ such that for all $i$ in $N$, $f^i > v^i$. 

Condition B: $F$ has a non empty interior. 

Condition C: for all $i$ in $N$ there exists $e(i)$ in $E_1$ such that $e(i)^i > v^i$. 

Condition D: for all $i$ in $N$ there exist $e(i)$ and $f(i)$ in $E_1$ such that $e(i)^i > f(i)^i$, 
and $F$ has a non empty interior. 

As E' is included in E, condition B is obviously stronger than condition A, and 
condition D than condition C. 

Moreover it is clear that condition D is stronger than condition B and condition 
C than condition A (because any Nash equilibrium belongs to F). This was not 
obvious a priori. 

In the following we will try to give the main ideas involved in the theorems and 
the sketch of some proofs. 

In all the proofs there are two inclusions. One of them is easy : 

Lemma 1.1: If $a$ stands for $\infty$, $n$, or $\lambda$, 

$E'_a$ is included in $E_a$. 
$E_a$ is included in $F$. 

Proof: 

Each subgame perfect equilibrium is a Nash equilibrium, which proves the 
first statement. 







The random one stage payoff takes its value in $g(S)$, which is included in $D$, 
and hence, as $D$ is convex and closed, expectation, average, and limits are in $D$ 
also. So $E_a$ is included in $D$. 

To show that any equilibrium is in IR, we use the standard signalling hy- 
pothesis. 

If $x$ is such that $x^i < v^i$, we construct a deviation for player $i$ from the strategy 
$\sigma^i$ ($\sigma$ being the profile leading to the payoff $x$), that gives him at least $v^i$ at each stage. 
This will contradict the fact that $x$ may be an equilibrium payoff. 

If $\sigma^{-i}(1)$ is the vector of mixed moves of the players other than $i$ at stage one, 
by definition of $v^i$, $i$ has a best reply $s^i(1)$ which satisfies: $g^i(s^i(1), \sigma^{-i}(1)) \ge v^i$. 

At each stage, $i$ knows the previous history (here we use standard signalling). 
So, knowing the strategy of the other players, $i$ can deduce their mixed moves 
at that stage: in fact the other players’ strategies take into account their 
information, say $h_n$ at stage $n$, to choose their moves, $\sigma^{-i}(h_n)$; as $i$ knows this 
information, he can compute the moves if he knows the strategies. So he can 
play a one stage best reply that gives him at least $v^i$. ■ 

To show the reverse inclusion, two main ideas are used: plan and threat. 

A plan is a play associated to $f$ in $F$, i.e. a play that leads to this payoff. It is seen 
as a cooperative device, from which players may deviate. Note that 
the play is a sequence of pure moves so that potential deviations are observable 
under standard signalling. 

A threat is a profile of strategies that gives to one player a “bad” payoff, what- 
ever he does. The maximal threat against player i leads to a payoff of at most 
v'^ for him. 

To prove that a payoff is an equilibrium payoff, it will be enough to find a plan 
that gives this payoff, and a threat that is sufficient to prevent deviation from 
this plan. It is conceived as a punishment that is applied to a deviating player. 
In the case of subgame perfect equilibria, the threat must be an equilibrium 
strategy in the subgame starting after the deviation. Hence it is more difficult 
in that case to find good threats. You have to reward a non deviating player 
who has punished a deviator. 

In discounted games, the present has more impact than the future, so some 
threats may not be effective. Punishment is more difficult, but as $\lambda \to 0$, it 
becomes possible to sustain as equilibrium payoffs all vectors in F. As time 
counts, postponing threats and rewards is more difficult compared to the 
undiscounted case. 

In finite games, some backward induction effects may appear. This means that 
at the last stage, the players play a one shot game. This is due to the fact 
that the end of the game is the same for everybody, and that at that stage it is 
public knowledge that it is the end of the game. This effect would not appear 
if the players did not know the last stage of the game or if the end of the game 
was not the same for everybody; see the chapter of Professor Neyman about 
“Cooperation through repetition” in these proceedings. 

We will give a more detailed proof of $E'_\infty = F$, and some ideas of how 
conditions A, B, C, D are used to obtain the corresponding results. 

Result 1: $E_\infty = F$ is the so-called “Folk Theorem”. 

It is obviously implied by the following: 

Result 2: $E'_\infty = F$. (see Aumann and Shapley [5], and Rubinstein [27]) 

This result was proved in the 1970s; for more details, see Megiddo [23] p.O. 
Sketch of the proof of $F \subset E'_\infty$: 

Let $f$ be in $F$; we want to show that $f$ is a subgame perfect equilibrium payoff. 
$f$ is a convex combination of $m + 1$ elements of $g(S)$ (by Carathéodory’s 
theorem); let us write $f = \sum_\alpha \mu_\alpha f_\alpha$, with $\mu = (\mu_\alpha)$ in the simplex of $\mathbb{R}^{m+1}$ and $f_\alpha$ 
in $g(S)$. 

If the coefficients are rational, there is a plan $p$ where the players play the 
pure moves leading to $f_\alpha$ with frequency $\mu_\alpha$, that gives payoff $f$. If people are 
supposed to follow such a plan, they should play pure moves at each stage, and 
so, deviations are observable. One can find such a play in a uniform way, i.e. 
such that after any history, the frequency of each move in the future remains 
the same. 
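
A cyclic plan with rational weights can be sketched as follows. This is an illustration under stated assumptions, not the construction of the original proof: the helper names are hypothetical, the payoff vectors are taken from the battle of the sexes, and the cycle simply lists each profile in consecutive blocks rather than spreading the occurrences evenly.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def cyclic_plan(weights):
    """One cycle of profile indices in which index a appears weights[a] * d times,
    d being the common denominator of the (rational) weights."""
    d = reduce(lambda a, b: a * b // gcd(a, b), (w.denominator for w in weights), 1)
    cycle = []
    for alpha, w in enumerate(weights):
        cycle += [alpha] * int(w * d)
    return cycle


def cycle_payoff(cycle, payoffs):
    """Exact average payoff vector over one cycle."""
    m = len(payoffs[0])
    return tuple(sum(Fraction(payoffs[a][i]) for a in cycle) / len(cycle)
                 for i in range(m))


# Hypothetical target: weights 1/2, 1/3, 1/6 on the payoffs (2,1), (1,2), (0,0).
payoffs = [(2, 1), (1, 2), (0, 0)]
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
cycle = cyclic_plan(weights)
f = cycle_payoff(cycle, payoffs)
print(cycle, f)
```

Repeating the cycle forever yields the average payoff f, and every stage prescribes a pure profile, so any deviation is observable under standard signalling.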

Let the strategies be the following : play as indicated by the plan p, and if you 
see a deviation by player i punish him, during a determined finite number of 
stages, such that any one stage gain by deviating at stage n is inferior to the 
loss due to the punishment up to 1/n. The payoff of a deviating player being 
the limit of the average payoff over n stages, for a deviation to be profitable, a 
player has to deviate at infinitely many stages. There is no gain by deviating if 
the gain per deviation becomes smaller and smaller when the deviation is made 
later. Finally a deviation during the punishment phase is ignored, but it is never 
profitable since it lasts for finitely many stages. 

If several players deviate, choose the first one in some order, and punish him. 

So, there is a plan giving $f$ as its payoff, and a threat that makes any deviation 
unprofitable. As the punishment is of finite length, applying the threat is an 
equilibrium strategy in the subgame starting after a deviation (it doesn’t affect 
the punisher’s payoff). Hence $f$ is a subgame perfect equilibrium payoff. 

If the coefficients are not rational, approximate them by rationals: if 
$f = \sum_\alpha \mu_\alpha f_\alpha$, where some $\mu_\alpha$ are irrational, define a sequence of rationals 
$\mu^n = (\mu^n_\alpha)$ converging to $\mu$. The sequence $(f^n = \sum_\alpha \mu^n_\alpha f_\alpha)_n$ converges to $f$. For any 
$n$ let $d_n$ be the common denominator of the $\mu^n_\alpha$. Define the following plan: play 
so as to get $f_\alpha$ with frequency $(\mu^1_\alpha)$ during $d_1$ stages, then $f_\alpha$ with frequency 
$(\mu^2_\alpha)$ during $d_2$ stages and so on... The threat is the same; these strategies lead 
to a subgame perfect equilibrium. ■ 

We will now comment on the conditions A, B, C, D. We are going to sketch the 
proofs showing why the folk theorems are true under these conditions and to 
explain how they are used. 

The convergence results are always considered to be under the Hausdorff 
topology. We want to prove that any vector in $F$ is the limit, as $\lambda \to 0$ or $n \to \infty$, of 
vectors in $E^b_a$ (with the notation we already used above for $a$ and $b$). 

Result 3: $\lim_{\lambda \to 0} E_\lambda = F$. (see Sorin [32]) 

Condition A: there is an element $f$ in $F$ such that for all $i$, $f^i > v^i$. 

This condition implies that for any $f$ in $F$ there is an $f'$ in $F$ close to $f$ such 
that $f'^i > v^i$ for all $i$. 

So one can assume that one starts with an $f$ in $F$ such that for all $i$, $f^i > v^i$. 

Let us choose a plan that achieves $f$ as a payoff in $G_\lambda$ for $\lambda$ small enough, in 
such a way that after any history, the future payoff is within $\varepsilon/2$ of $f$. (The 
future payoff may not be equal to $f$, but for $\lambda$ small enough it can be as close 
as desired to $f$.) The strategies are to follow this plan and to punish a deviator 
$i$ by decreasing his payoff to the level $v^i$. If $\min_{i \in N}(f^i - v^i)$ is bigger than 
$\varepsilon$, then one can punish any deviation by decreasing the future payoff of the 
deviating player by at least $(1 - \lambda)\varepsilon/2$. As soon as $\lambda a < (1 - \lambda)\varepsilon/2$, with $a$ being 
the maximum one shot gain from deviating, this threat is enough to prevent 
deviation, because it makes it unprofitable. 

So, there is a $\lambda^*$ such that for all $\lambda < \lambda^*$, $f$ is an equilibrium payoff of $G_\lambda$, hence 
the result. ■ 
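
The inequality in the proof can be solved explicitly for lambda: it holds exactly when lambda is below eps / (2a + eps). The sketch below illustrates this; the function names and the numerical values of a and eps are assumptions chosen for the example.

```python
def patience_threshold(a, eps):
    """Value lam_star such that lam * a < (1 - lam) * eps / 2 iff lam < lam_star."""
    return eps / (2 * a + eps)


def deviation_deterred(lam, a, eps):
    """True when the discounted one shot gain lam * a is smaller than the
    discounted loss (1 - lam) * eps / 2 caused by the punishment."""
    return lam * a < (1 - lam) * eps / 2


# Hypothetical numbers: maximal one shot gain a = 1, margin eps = 0.2.
a, eps = 1.0, 0.2
lam_star = patience_threshold(a, eps)
print(lam_star, deviation_deterred(0.09, a, eps), deviation_deterred(0.10, a, eps))
```

The threshold shrinks with eps, which reflects the fact that payoffs close to the individually rational level require more patient players.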

An example due to Forges, Mertens, and Neyman ([8]), showing that the folk 
theorem may not be true if condition A is not satisfied, is of the following kind: 
consider a three player game, where player 3 is a dummy and players 1 and 2 
are active players; the payoffs are given by the following matrix: 

$\begin{pmatrix} (1,0,0) & (0,1,0) \\ (0,1,0) & (1,0,1) \end{pmatrix}$ 

There is a unique equilibrium payoff of the one shot game: (1/2, 1/2, 1/4) 
(players 1 and 2 both playing (1/2, 1/2)). 

The point (1/2, 1/2, 1/2) is in $F$. But on any path that leads to this payoff, one 
of the active players has a profitable one shot deviation (a one shot deviation 
is valuable because the future is discounted), and no threat can prevent 
him from deviating because the maximum threat gives him 1/2, which is the 
original equilibrium payoff. In fact, $E_\lambda$ is reduced to (1/2, 1/2, 1/4), and (1/2, 
1/2, 1/2) cannot belong to the limit of $E_\lambda$. 



Result 4: $\lim_{\lambda \to 0} E'_\lambda = F$. (see Fudenberg and Maskin [13]) 

Condition B: F has a nonempty interior. 

Condition B tells us that for all $f$ in $F$ and all $\delta > 0$, there is an element $f'$ of $F$ such 
that $\|f - f'\| < \delta$, and $f'$ is in the interior of $F$. So it is enough to prove the 
result for $f$ in the interior of $F$. 

The idea is that it may be costly to punish somebody, so you must get a reward 
if you do so, because we are looking for perfect equilibria. But the reward the 
punisher gets must not also be a reward for the deviator. That is why a dimen- 
sion condition is needed. 

Moreover, the players must not be tempted to deviate during the punishment 
phase. But during that phase, the players play mixed moves, hence deviations 
may not be detectable: the other players don’t see the randomizing distribu- 
tion. In order to prevent deviation in that phase, you have to manage to give 
the punishing player the same payoff whatever moves he chooses during the 
punishment phase (at least in the support of the punishing strategy); hence he 
has no incentive to deviate. This is possible because the set of feasible payoffs is 
convex in the discounted game for $\lambda$ small; it would not be possible in a finitely 
repeated game. 

Definition: Let $f$ be a vector in $F$ and $s$ a plan inducing $f$ as its payoff, let 
$h$ be any finite truncation of $s$ of length $r$; then the continuation payoff after $h$ 
leading to $f$ is a vector $c(h)$ in $\mathbb{R}^m$ such that 

$f = \lambda \sum_{k=1}^{r} (1-\lambda)^{k-1} g(s_k) + (1-\lambda)^r c(h)$. 
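
The decomposition of a discounted payoff into the payoff of a finite history and a continuation payoff can be illustrated on a cyclic plan. The following sketch is an assumption-laden example: the function names, the two-stage cycle, and the values of lambda and r are all hypothetical.

```python
def discounted_value(cycle, lam):
    """Discounted payoff of playing `cycle` (a list of payoff vectors) forever."""
    T, m = len(cycle), len(cycle[0])
    block = [sum(lam * (1 - lam) ** k * cycle[k][i] for k in range(T))
             for i in range(m)]
    return tuple(b / (1 - (1 - lam) ** T) for b in block)


def continuation(cycle, lam, r):
    """c(h) after the first r stages of the cyclic plan: value of the shifted cycle."""
    shift = r % len(cycle)
    return discounted_value(cycle[shift:] + cycle[:shift], lam)


cycle = [(2, 1), (1, 2)]  # alternate the two efficient payoffs of Example 1
lam, r = 0.1, 3
f = discounted_value(cycle, lam)
c = continuation(cycle, lam, r)

# Recompose f from the first r stage payoffs and the continuation payoff.
head = [sum(lam * (1 - lam) ** k * cycle[k % len(cycle)][i] for k in range(r))
        for i in range(2)]
recomposed = tuple(head[i] + (1 - lam) ** r * c[i] for i in range(2))
print(f, c, recomposed)
```

Here the continuation payoff after an odd number of stages differs from f, because the shifted cycle starts with the profile that is worse for player 1; this is the effect that the smooth plans discussed next are meant to control.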



Note that the continuation payoff after a history is in general different from $f$. 
For any $f$ in $F$ there is a plan $s$ that leads to payoff $f$, for $\lambda$ small enough. Choose 
a smooth plan in the following sense: given any finite history, the continuation 
payoff has to be close to $f$. We are going to use the following lemma: 

Lemma 1.2: For any $f$ in $F$ and positive $\delta$, there exists a $\lambda_1$ such that for 
all $\lambda < \lambda_1$, there is a plan inducing $f$ as its payoff, such that the continuation 
payoff after any history $h$ is at a distance of at most $\delta/4$ of $f$. 

Let us assume that $f$ is at a distance $\delta$ of the boundary of $F$ and try to play 
the plan defined in the lemma. 

Remark: If you want to achieve a payoff $f$ as subgame perfect equilibrium 
payoff, you have to prove that after any history, there is no profitable deviation. 
The lemma says that after any history, the continuation payoff for player $i$ is 
at least $f^i - \delta/4$. Hence you have to prove that there is a threat that leads to 
a payoff of less than $f^i - \delta/4$. Then you can punish a deviating player in the 
following way: first you give him his individually rational level for a sufficiently 
long time, in order to cancel his gain from deviating, and then you switch to a 
plan that gives him less than $f^i - \delta/4$. 

If there is a deviation by a player i, the other players punish him. We focus 
on punishments of the following kind: first the deviator’s gain by deviating is 
compensated, and then a new continuation payoff is fixed. 

More precisely, after a deviation of player $i$, the other players punish him (at 
his individually rational level) during $R$ periods. $R$ is a fixed integer such that 
any one shot deviation of any player $j$ is made unprofitable: by getting $v^j$ for 
$R$ periods, the average for these $R + 1$ stages will be at most $v^j + \delta/2$ for any $\lambda$ 
smaller than some $\lambda_2$. If the strategies specify the moves after these $R$ periods 
such that the continuation payoff of the deviator $i$ is less than or equal to the 
payoff under the initial plan, this one being at least $v^i + \delta/2$, then a deviation 
is unprofitable. 

In order to get a subgame perfect equilibrium, one has to check that a punisher 
has no incentive not to punish, and that there is no profitable deviation for any 
player after these R periods. 

For every history $h$ of length $R$, and for all $i$, define the vector payoff $d[i](h)$ 
such that playing $h$ during $R$ periods and then receiving $d[i](h)$ in the future 
gives, as a whole, payoff $v^j + \delta$ to player $j \ne i$, and $d[i](h)^i = v^i + \delta/2$ to player $i$. 

Note that for $\lambda$ small enough, one has also $d[i](h)^j \ge v^j + \delta - \delta/4$. If $d[i](h)$ 
can be implemented as a perfect equilibrium, no one has an incentive not to 
punish a deviator since whatever he does during the $R$ periods of punishment 
he gets the same payoff. So it remains to prove that $d[i](h)$ can be achieved as 
a subgame perfect equilibrium payoff for any $h$. 

Let $s[i](h)$ be a plan inducing the payoff $d[i](h)$ and satisfying Lemma 1.2. In 
particular we want to prevent a new deviation of the same player $i$, after a 
history $hh'$ (histories begin here after the first deviation). The problem is the 
following: after $h$ the future payoff of $i$ is $v^i + \delta/2$, and after $hh'$ it may be $v^i + \delta/4$ 
because of the remark we made above. We hence have to justify $v^i + \delta/4$ as a 
perfect equilibrium payoff. But after some history, according to the same remark 
the future payoff of $i$ may be $v^i$. After this history, no threat can prevent $i$ from 
deviating. That is why one has to achieve $v^i + \delta/2$ through a plan that gives 
to $i$ his “bad” payoffs first, i.e. $s[i](h)$ is such that the continuation payoff for 
player $i$ is always bigger than $v^i + \delta/2$. Such a plan exists. 

We can now check that no deviation is profitable for any player at any time: 

- We have already seen that a deviation of player $i$ is profitable only if he 
also deviates after $h$. If not, his one shot gain by deviating is compensated during 
the first $R$ stages after the deviation, and the continuation payoff afterwards is 
$v^i + \delta/2 < f^i - \delta/4$. 

Let us assume now that a deviation of player $i$ has already occurred. 

In fact, we forget the plan leading to the payoff $f$; we switch to a situation 
where the players are supposed to follow the plan $s[i](h)$. 

- During the first $R$ periods after the deviation, $i$ plays a best reply and gets at 
most $v^i$; hence there is no profitable deviation for him. 

- If $i$ deviates after $hh'$, the other players will switch to the plan $s[i](h)$. If he 
had followed the plan, his continuation payoff would have been greater than 
$v^i + \delta/2$. After the deviation, his continuation payoff is $v^i + \delta/2$. The threat is 
effective. 

- A player $j \ne i$ has no profitable deviation during the first $R$ stages after $i$’s 
deviation, because whatever he does, his overall payoff is $v^j + \delta$. 

- If player $j$ deviates after $hh'$, the players punish him for $R$ periods and then 
they switch to the plan $s[j](h')$. His continuation payoff is then $v^j + \delta/2$, and 
this threat is effective because if he did not deviate his continuation payoff would 
be at least $d[i](h)^j - \delta/4$, which is greater than or equal to $v^j + \delta - \delta/4 - \delta/4$. ■ 

Remark: If there is a public extensive form correlation device, i.e. if the players 
receive a public signal before each stage of the game, then they can achieve 
$f$ through a plan such that for every history $h$, $c(h) = f$; hence the previous 
construction is much easier. 

There is an example of a game where this condition is not satisfied in [13], and 
the result is false, so a condition is indeed necessary. 

The example is the following game $G$: 

Player 1 chooses a row, player 2 a column and player 3 a matrix. The payoffs 
are given by the following matrices: 

$\begin{pmatrix} (1,1,1) & (0,0,0) \\ (0,0,0) & (0,0,0) \end{pmatrix} \qquad \begin{pmatrix} (0,0,0) & (0,0,0) \\ (0,0,0) & (1,1,1) \end{pmatrix}$ 

The threat point is (0,0,0). The players always have the same payoffs. So 
one can define $w$ as the minimal payoff achievable through a subgame perfect 
equilibrium in $G_\lambda$. 

Let $\sigma$ be a subgame perfect equilibrium. It is easy to see that at least one player 
can (possibly by deviating) get 1/4 at stage one. Since his future payoff is at 
least $w$, this gives $w \ge \lambda/4 + (1 - \lambda) w$, hence $w$ is at least 1/4. 

Hence (0,0,0) cannot be approached by $E'_\lambda$. 
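
The claim that some player can always secure 1/4 at stage one can be checked on a grid of mixed moves. This is a numerical illustration only, with hypothetical names: p, q and r denote the probabilities with which players 1, 2 and 3 choose the top row, the left column and the left matrix.

```python
def best_reply_payoffs(p, q, r):
    """Best-reply payoff of each player against the mixed moves of the two others.

    Everybody receives 1 exactly when all three players choose their first
    action, or all three choose their second action.
    """
    br1 = max(q * r, (1 - q) * (1 - r))
    br2 = max(p * r, (1 - p) * (1 - r))
    br3 = max(p * q, (1 - p) * (1 - q))
    return br1, br2, br3


grid = [k / 20 for k in range(21)]
worst = min(max(best_reply_payoffs(p, q, r))
            for p in grid for q in grid for r in grid)
print(worst)
```

The reason is a pigeonhole argument: among p, q and r, two lie on the same side of 1/2, so the third player can obtain at least 1/2 times 1/2 by matching them. The bound is attained when all three probabilities equal 1/2.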

Result 5: $\lim_{n \to \infty} E_n = F$. (see Benoit and Krishna [7]) 

Condition C: for all $i$ there exists $e(i)$ in $E_1$ with $e(i)^i > v^i$. 

Some condition is needed: there are games where some feasible individually 
rational payoffs cannot be achieved as equilibrium payoffs. In fact, in the following 
prisoner’s dilemma, $E_n = \{(1, 1)\}$ for all $n$, so the result does not hold. 

Example 2: The prisoner’s dilemma is the two player game defined by: 

$\begin{pmatrix} (3,3) & (0,4) \\ (4,0) & (1,1) \end{pmatrix}$ 

The above property is due to the fact that $E_1$ is reduced to the threat point. 
More precisely we have the general result [32]: 

Lemma 1.3: If $E_1 = \{v\}$, then $E_n = \{v\}$ for all $n$. 

Proof: As the game is finite, at stage $n$ each player knows it is the last stage. 
So on the equilibrium path, at that stage the payoff will be in $E_1$, hence it will 
be $v$. Let us consider the longest history $h$ (let $p$ denote its length) of positive 
probability on the equilibrium path after which the one stage payoff is not in 
$E_1$. So one player, say $i$, has a one shot profitable deviation after $h$. Hence, 
since we are on an equilibrium path, there is a punishment after stage $p + 1$, 
that prevents player $i$ from deviating. But the payoff after stage $p + 1$ on the 
equilibrium path is already the worst possible for him: there can be no threat. 
Hence the payoff will be $v$ at any stage along the equilibrium path. ■ 

Notice that the failure of the folk theorem in this case is not due to the 
existence of a unique equilibrium in strictly dominant strategies. In the following 
game there is a unique Nash equilibrium in strictly dominant strategies, but 
condition C holds and the result is true. 

$\begin{pmatrix} (3,1) & (0,0) \\ (4,2) & (1,0) \end{pmatrix}$ 

In this case the sequence of moves (Top, Left), (Bottom, Left) is an equilibrium 
path in $G_2$. The strategies of player 1 (resp. 2) are the following: play as 
recommended above, and if there is a deviation at stage 1, player 2 plays Right. 
Hence one can exhibit in this case an equilibrium payoff in $E_2$, (7/2, 3/2), that 
is different from the payoff in $E_1$, (4,2), in spite of the existence of a unique 
Nash equilibrium in strictly dominant strategies. 
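
The two stage equilibrium above can be verified by enumerating the deviations of each player. The sketch below is an illustration with hypothetical names. It assumes, as one reading of the text, that player 1 plays Top then Bottom whatever player 2 does, and that player 2 plays Left at stage 2 unless player 1 deviated at stage 1.

```python
# One shot payoffs: rows T, B for player 1, columns L, R for player 2.
G = {("T", "L"): (3, 1), ("T", "R"): (0, 0),
     ("B", "L"): (4, 2), ("B", "R"): (1, 0)}

# Total payoffs along the recommended path (Top, Left), (Bottom, Left).
on_path = tuple(G[("T", "L")][i] + G[("B", "L")][i] for i in range(2))


def best_total_player1():
    """Best two stage total of player 1 against player 2's strategy."""
    best = None
    for a1 in "TB":
        reply2 = "L" if a1 == "T" else "R"  # stage-1 deviation is punished by R
        total = G[(a1, "L")][0] + max(G[(a2, reply2)][0] for a2 in "TB")
        best = total if best is None else max(best, total)
    return best


def best_total_player2():
    """Best two stage total of player 2 when player 1 plays T then B."""
    return (max(G[("T", c1)][1] for c1 in "LR")
            + max(G[("B", c2)][1] for c2 in "LR"))


print(on_path, best_total_player1(), best_total_player2())
```

Neither player can improve on the totals (7, 3) of the path, so the average payoff (7/2, 3/2) is a Nash equilibrium payoff of the two stage game. The threat of playing Right is not itself a one shot best reply, which is why this is a Nash equilibrium and not a subgame perfect one.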

So one needs a condition to avoid that $E_n = \{v\}$, and we will show how to get 
the result under the sufficient condition C. 

The spirit of the proof is the following: 

- find a play of length $T$ that leads to a payoff that approaches $f$ in $F$ within $\varepsilon$. 

- the strategies are: 

first follow the plan that consists in playing the above play cyclically up to $Rm$ 
stages before the end of the game. (Recall that $m$ is the number of players). 
Then, play $R$ cycles leading to $(e(1), \ldots, e(m))$. This is crucial: playing 
equilibrium strategies at the end of the game excludes the possibility of profitable 
deviations at the end. It avoids the “backward induction” effect we have seen 
in the prisoner’s dilemma case. 

If there is a deviation of player $i$ during the first phase of the strategy, switch to 
strategies punishing him to $v^i$ for the remaining stages. The gain of deviating 
will never exceed $R(e(i)^i - v^i)$ if $R$ is large enough. It means that if a deviation 
from the equilibrium strategies occurs far enough from the end of the game, you 
have time to punish a deviator. 

$R$ is a constant, so as $n$ goes to infinity, the presence of these last stages payoffs 
does not influence the limit payoff. ■ 

Result 6: $\lim_{n \to \infty} E'_n = F$. (see Benoit and Krishna [6] and Gossner [14]) 

Condition D: for all $i$ there exist $e(i)$ and $f(i)$ in $E_1$ such that $e(i)^i > f(i)^i$, 
and $F$ has a non empty interior. 

The proof uses both the ideas of the proof of Result 5 (add terminal stages 
where you play an equilibrium strategy, to avoid backward induction effects), 
and the ideas of the proof of Result 4 (rewarding the punishers, that is why we 
need the interior of F not to be empty). 

At the end of the game play cycles of $(e(1), \ldots, e(m))$; in that part of the game 
there will be no profitable deviations. Before, play cycles of a path that leads to 
a payoff close to $f$. If there is a deviation, punish during a finite number of stages, 
and then reward the punishers (as in the theorem about $E'_\lambda$). If the deviation 
occurs too late to do all this, then, at the end of the game switch from cycles leading 
to $(e(1), \ldots, e(m))$, to cycles leading to $(e(1), \ldots, e(i-1), f(i), e(i+1), \ldots, e(m))$ 
if $i$ was the deviator. If the lengths of the cycles are chosen appropriately this 
is a perfect equilibrium. 

The above procedure works if you can identify the deviator, i.e. if you restrict 
yourself to pure strategies (even in the punishment phase). If not, the proof is 
much more intricate, and requires a statistical test and a family of late adapted 
payoffs (see [14]). ■ 

Comments 

a) In two player zero sum games, $D$ is a segment and $F$ is reduced to a single 
point, $(v, -v)$, where $v$ is the value of the game. 

b) For the two player case, the results hold under weaker conditions. The 
reason for this is that both players can minmax each other at the same time 
(see [13], [6], [7]). 

c) It is clear that if the players could correlate their moves, they could not 
achieve more payoffs since the set of correlated payoffs is the convex hull of the 
set of feasible payoffs. However, if the players could correlate secretly to punish 
somebody, the individually rational level could change, but this is not the case 
if the correlation signal is public. 

d) If only pure strategies are allowed, $v$ is higher, so $F$ may be smaller, even 
empty. It is the case in the following zero sum game: 




With mixed strategies, $v = (1/2, -1/2)$, $D = \{(x, -x) : x \in [-1, 1]\}$, hence 
$F = \{v\}$. 

With pure strategies, $D$ is the same but $v = (1,0)$, hence $F = \emptyset$. 

e) If mixed moves are observable, and not only their realizations, the game is 
similar to a game with continuous strategy sets, and pure observable strategies. 
It is no longer a finite game: the set of strategies of $i$ is $\Delta(S^i)$ and the histories 
are $h_n = (s_1, \ldots, s_n)$, with $s_k \in \Delta(S^1) \times \ldots \times \Delta(S^m)$. 

f) If the players have different discount factors, we may get feasible payoffs 
and even equilibrium payoffs outside $D$. For example, in the battle of the sexes 
(cf. Example 1), if player 1 is much more patient than player 2, it may be an 
equilibrium strategy to play first (Bottom, Right), and next (Top, Left). So 
we may have an equilibrium payoff close to (2,2) in the repeated game with 
different discount factors for the two players. 
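
The payoff close to (2,2) can be computed directly. The sketch below evaluates only the payoffs of the path, not the equilibrium property; the function name and the two discount parameters are assumptions, and the path is read as (Bottom, Right) at the first stage followed by (Top, Left) at every later stage.

```python
def first_then_forever(first, forever, lam):
    """Discounted payoff of one stage payoff `first` followed by `forever`
    at all later stages: lam * first + (1 - lam) * forever."""
    return lam * first + (1 - lam) * forever


# Hypothetical parameters: player 1 is patient, player 2 is impatient.
lam1, lam2 = 0.01, 0.99
x1 = first_then_forever(1, 2, lam1)  # player 1: 1 at (Bottom, Right), then 2
x2 = first_then_forever(2, 1, lam2)  # player 2: 2 at (Bottom, Right), then 1
print(x1, x2, x1 + x2)
```

Each player weighs most heavily the stages at which he receives 2, so the pair of payoffs satisfies x + y > 3 and lies outside the triangle D of Example 1.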

g) In overlapping generation models people play during a finite number K 
of stages. Generations follow each other; each generation lives for K periods, 
and is then replaced by a new generation, starting their lives. 

In this case, a folk theorem holds with simpler conditions (see [15], [16], [30], 
[31]). 

For instance, assume the people live two periods, one in which they are young, 
and one in which they are old. Assume that at each period one young and 
one old are faced and that the game is a prisoner’s dilemma. The following 
strategies are in equilibrium: play cooperatively when you are young and non 
cooperatively when you are old; if a young person deviates, he will be punished 
when old by a new young. 

h) Another question is whether this result can be extended to extensive form 
games. In extensive form games, the information of the players at a terminal 
node is not precisely defined. But one usually assumes that people know the 
node they have reached. Take the following game with players 1 and 2: 

Example 3: 

[Figure: extensive form game of Example 3.] 




In the normal form of this game, the set of strategies of player 1 is {L, R}, 
and the set of strategies of player 2 is {aa, ab, ba, bb}. In the extensive form 
repeated game, if player 1 played L in one period, and if he saw A, he knows 
that the strategy of player 2 was aa or ab, but he does not know which of them. 
In the normal form repeated game under standard signalling, he can distinguish 
aa from ab. Hence repeating an extensive form game is equivalent to repeating 
a normal form game with non standard signalling (see [35]). 

i) Another point of view would be the following: instead of looking for conditions 
under which, asymptotically, the set of equilibrium payoffs is equal to the set 
of feasible individually rational payoffs, one can try to characterize the set of 
equilibrium payoffs in general. This problem is studied in a paper by Wen [36].

2 General signalling 



We are going to consider here the same kind of models as in the previous section, 
but now the players only get signals about the past. They do not know the whole 
history. In many cases this kind of model is more realistic than the previous
one, but there is no straightforward generalization of the results of the previous
section. The general signalling framework is much more difficult to study.

2.1 The Model 

The model is basically the same as in the previous section, except that the 
players are no longer informed of the moves and only get a private signal on 
them. 
More precisely, $A^i$ is the finite set of private signals for player $i$ and $h^i$ is the
signalling function of player $i$; it is a map from $S$ to $A^i$. More generally, one
can define a function $h$ from $S$ to $\Delta(A^1 \times \dots \times A^N)$; in that case the law of the
signal received by player $i$ is the $i$-th component of $h(s)$; this allows for random
signals. This more general framework can be useful when dealing with moral
hazard situations. But here we will restrict ourselves to the case of deterministic
signals, i.e. to the case where there exist functions $h^i : S \to A^i$.

As before, the repeated game associated to G is played in stages : 

Stage 1: each player $i$ chooses a move $s_1^i$ in $S^i$; $s_1 = (s_1^1, \dots, s_1^N)$ is not
announced. Each player $i$ is now informed of $h^i(s_1) = a_1^i$.

Stage n: if $s_k = (s_k^1, \dots, s_k^N)$ and $a_k = (a_k^1, \dots, a_k^N)$ denote the profile of
moves and signals at stage $k$, then at stage $n$, knowing $(a_1^i, \dots, a_{n-1}^i)$, each
player $i$ chooses $s_n^i$ in $S^i$, and is informed of $h^i(s_n) = a_n^i$.
So the players know only a function of the previous moves; for instance, in
a Cournot framework, they observe the realized price and not the vector of
quantities.
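The Cournot remark can be illustrated with a small sketch. Everything specific here is an assumption made for illustration only: three firms, the move set {1, 2, 3}, and a linear inverse demand. Each firm learns its own quantity and the realized price, and the code groups the move profiles that firm 0 cannot tell apart.

```python
from itertools import product

QUANTITIES = (1, 2, 3)   # assumed finite move set S^i, identical for the three firms
N_FIRMS = 3              # assumed number of players


def price(s):
    """Hypothetical linear inverse demand: the price depends only on total output."""
    return max(0, 10 - sum(s))


def signal(i, s):
    """What firm i learns after a stage: its own move and the realized price."""
    return (s[i], price(s))


def indistinguishable_for(i):
    """Group the move profiles that generate the same signal for firm i."""
    groups = {}
    for s in product(QUANTITIES, repeat=N_FIRMS):
        groups.setdefault(signal(i, s), []).append(s)
    return groups


if __name__ == "__main__":
    for info, profiles in sorted(indistinguishable_for(0).items()):
        print(info, profiles)
```

With three firms, firm 0 cannot separate the opponents' quantities (1, 3), (2, 2) and (3, 1) when it plays 1, so the signalling is genuinely non standard. With only two firms the price and one's own quantity would reveal the opponent's move.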

In the extensive form game presented in example 3, $h^1(K, xy) = x$ ($K$ stands
for $L$ or $R$, and $x$ and $y$ for $a$ or $b$).

In this framework, standard signalling corresponds to the case where $h^i$ is the
identity function: the signal each agent gets is the profile of moves.

In the two player finite case, one represents each function $h^i$ by a matrix. For
example, if there are two strategies and two players, let $h^1$ be represented by:




This means that if 1 plays top, he cannot observe if 2 played left or right, 
whereas he can observe it if he plays bottom. 

Note that we always assume that each player knows the moves he played and 
recalls all his previous information. Hence we usually define through signalling 
matrices only the incremental information at each stage, for each player, in 
addition to his own move. 

Remark: public information (all the players have the same information) and 
perfect recall (each player always remembers all he did and all he knew) imply 
standard signalling. 

Consequences 

You cannot rely on histories, and you may not be able to check deviations. So 
it may be more difficult to define a cooperative plan. 

Moreover, it may be the case that some players get signals allowing them to
correlate their moves, without another player, say $i$, knowing about it. In that
case, to evaluate the individually rational level for player $i$, one must take the
minimum over $\Delta(S^{-i})$ and not over $\prod_{j \neq i} \Delta(S^j)$, i.e. player $i$ must take into
account the fact that his opponents may correlate their moves. So the threat
point may be different from the previous value $v$.
Moreover, the individually rational value may change during the game. In fact
the correlation between the players depends only on the signals they get, hence
on the actions taken.

This general problem is an open question. Note, however, that in the two
player case $v$ can be defined as before because there is no problem of correlation.

2.2 Recursive Structure and Public Signals 

In this section we concentrate on discounted games, subgame perfect equilibria, 
and public signals. Some more results will first be needed in order to characterize 
the set of subgame perfect equilibria in the discounted game with public signals. 
The tools we introduce here are inspired from the work of Abreu [1], Abreu, 
Pearce and Stacchetti [2] and Mertens [24]. 

2.2.1 Optimality Principle 



In this paragraph we will moreover restrict ourselves to standard signalling. We 
show a few useful results that we will extend later to public equilibria of games 
with general signalling. 

The first result will be used to characterize the set $E'_\lambda$.

Definition: A one shot deviation of player i from a behavioral strategy t is a 
behavioral strategy s such that: 

- there is a history h after which the mixed move induced by s differs from the 
one induced by t. 

- after any other history, s and t induce the same mixed moves. 

Lemma 2.1: (Optimality Principle) In a repeated game with continuous payoffs,
a strategy profile $\sigma$ is a subgame perfect equilibrium iff there is no one shot
profitable deviation, for any player $i$.

Sketch of the proof : 

The condition is obviously necessary. 

Let $\sigma$ be a profile of strategies that is not a subgame perfect equilibrium. Hence
there is a player $i$ and a pure strategy, say $\tau^i$, that is a profitable deviation for $i$
from $\sigma^i$ after some history $h$.

Let $\theta(n)$ be the following strategy: play $\sigma^i$ as long as $h$ has not occurred; when
$h$ has occurred play $\tau^i$ for $n$ stages and then return to play $\sigma^i$. As the payoff is
continuous, there is an integer $N^*$ such that for all $n \ge N^*$, $\theta(n)$ is a profitable
deviation from $\sigma^i$.

Consider the smallest $n$, say $N$, such that $\theta(n)$ is a profitable deviation from $\sigma^i$.
Thus $\theta(N)$ is a profitable deviation from $\sigma^i$, and $\theta(N - 1)$ is not. Hence there
is a history $h'$ such that playing $\theta(N - 1)$ as long as $hh'$ has not occurred, and
after history $hh'$, playing $\tau^i$ instead of $\sigma^i$ once, is a profitable deviation from
$\theta(N - 1)$.

Consider now the following strategy: play $\sigma^i$ as long as $hh'$ has not occurred;
after $hh'$ play $\tau^i$ for one period and then play $\sigma^i$. It is a one stage profitable
deviation from $\sigma^i$. ■

This lemma will help us to characterize $E'_\lambda$ in the standard signalling case. The
idea is the following: in a discounted game, the game starting today and the
game starting tomorrow have the same structure. There is a kind of stationarity
along the play. Denote by $f$ tomorrow's evaluation of the future payoff, as a
function of today's moves. The overall payoff is $E_\sigma[\lambda g(s_1) + (1 - \lambda) f(s_1)]$, where
$s_1$ is the first stage profile of moves, and $\sigma$ is the profile of strategies. If $\sigma$ is a
subgame perfect equilibrium, the range of $f$ is included in the set of subgame
perfect equilibrium payoffs (even for $s_1$ outside the equilibrium path, because $\sigma$
is a subgame perfect equilibrium).
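The decomposition of the payoff into today's part and tomorrow's evaluation makes the one shot deviation test mechanical. The sketch below is an illustration under stated assumptions, not part of the text: it uses a hypothetical prisoner's dilemma with payoffs 3 (mutual cooperation), 4 (temptation), 0 (sucker) and 1 (mutual defection), and tests the grim trigger profile, with $\lambda$ the weight on today's payoff as in the formula above.

```python
from fractions import Fraction

# Assumed stage payoffs of player 1 in a symmetric prisoner's dilemma.
G1 = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 4, ("D", "D"): 1}


def no_profitable_one_shot_deviation(lam):
    """Test grim trigger against one shot deviations in both of its states.

    lam is the weight on today's payoff: total = lam*g(s_1) + (1 - lam)*f(s_1).
    By symmetry it is enough to look at player 1.
    """
    v_coop, v_punish = Fraction(3), Fraction(1)  # continuation values of the two states

    # Cooperative state: C is prescribed; playing D once triggers punishment forever.
    follow_coop = lam * G1[("C", "C")] + (1 - lam) * v_coop
    deviate_coop = lam * G1[("D", "C")] + (1 - lam) * v_punish

    # Punishment state: D is prescribed forever; playing C once does not change the future.
    follow_pun = lam * G1[("D", "D")] + (1 - lam) * v_punish
    deviate_pun = lam * G1[("C", "D")] + (1 - lam) * v_punish

    return follow_coop >= deviate_coop and follow_pun >= deviate_pun
```

With these assumed payoffs the cooperative-state condition reads $3 \ge 4\lambda + (1 - \lambda)$, i.e. $\lambda \le 2/3$: the players must put enough weight on the future.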

Notation: Let $F(W)$ be the set of functions $u : S \to W$, where $W$ is a fixed
bounded subset of $\mathbb{R}^N$. For $f \in F(W)$, let $G(f, \lambda)$ be the one shot game with
set of players $N$, pure moves $S^i$ and payoff function $\lambda g + (1 - \lambda) f$. We want to
characterize $E'_\lambda$ using $E_f(\lambda)$, the set of equilibrium payoffs of $G(f, \lambda)$.

Let $T_\lambda$ be the following operator: $T_\lambda(W) = \bigcup_{f \in F(W)} E_f(\lambda)$.

Theorem 2.2: $E'_\lambda$ is the largest fixed point of $T_\lambda$.

Sketch of the proof: 

We show first that $E'_\lambda$ is a fixed point of $T_\lambda$, and then that any fixed point of
$T_\lambda$ is included in $E'_\lambda$.

1- $E'_\lambda$ is a fixed point of $T_\lambda$.

Let $\sigma$ be a subgame perfect equilibrium of $G_\lambda$, and $u$ the associated payoff;
denote by $f(s_1)$ the payoff for the future after history $s_1$. As the game is
discounted, and as the equilibrium is perfect, $f(s)$ is in $E'_\lambda$ for all $s$ in $S$. By
definition of $\sigma$, $\sigma_1$ (the mixed move induced by $\sigma$ at stage 1) is a Nash
equilibrium of $G(f, \lambda)$, and it leads to the payoff $u$. Thus $u$ is in $E_f(\lambda)$. Hence for all
$u$ in $E'_\lambda$, there is an $f$ in $F(E'_\lambda)$ such that $u$ is in $E_f(\lambda)$: $E'_\lambda \subset T_\lambda(E'_\lambda)$.

Let now $u$ be in $T_\lambda(E'_\lambda)$. There is an $f$ in $F(E'_\lambda)$ such that $u$ is in $E_f(\lambda)$. As $f$
is in $F(E'_\lambda)$, there is a subgame perfect equilibrium $\tau(s)$ leading to payoff $f(s)$
for all $s$ in $S$; let $\sigma_1$ be the equilibrium strategy in $G(f, \lambda)$ that leads to $u$.
Clearly, playing $(\sigma_1, \tau(s))$ is a subgame perfect equilibrium of $G_\lambda$ that leads to
payoff $u$; so, $u$ is in $E'_\lambda$, and $T_\lambda(E'_\lambda) \subset E'_\lambda$.

Hence, $E'_\lambda$ is a fixed point of $T_\lambda$.
2- If a set $A$ is included in $T_\lambda(A)$, $A$ is included in $E'_\lambda$.
If $f_1$ is in $A$, there exists an $f_2$ in $F(A)$ such that $f_1$ is in $E_{f_2}(\lambda)$, with the
strategy profile $\sigma_1$. $f_2(s)$ is in $A$ for all $s$, so we can define $f_3(s)$ and $\sigma_2(s)$,
and so on. We thus define a sequence of strategies and payoffs. We now want
to show that $\sigma = (\sigma_1, \dots, \sigma_n, \dots)$ is a subgame perfect equilibrium ($\sigma_n$ is a function
of all the previous history). As the game is discounted, the payoff is continuous,
so the optimality principle applies. It is clear that $\sigma$ induces the payoff $f_1$,
hence it is enough to show that there is no one shot profitable deviation. But
this follows from the definition of $\sigma_n$, since $(\sigma_{n+1}, \dots)$ induces the payoff $f_{n+1}$.
So $f_1$ is in $E'_\lambda$ and hence $A \subset E'_\lambda$. ■
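Step 2 gives a practical test: to show that a candidate set of payoffs lies in $E'_\lambda$ it is enough to check that the set is included in its own image under $T_\lambda$. The sketch below is only an illustration under assumptions that are not in the text: a hypothetical prisoner's dilemma, a finite candidate set, and a restriction to pure one shot equilibria. The restricted operator `pure_T` therefore returns a subset of $T_\lambda(W)$, which is still enough for the inclusion test.

```python
from fractions import Fraction
from itertools import product

ACTIONS = ("C", "D")
PROFILES = list(product(ACTIONS, repeat=2))
# Assumed prisoner's dilemma payoffs g(s) = (payoff of player 1, payoff of player 2).
G = {("C", "C"): (3, 3), ("C", "D"): (0, 4), ("D", "C"): (4, 0), ("D", "D"): (1, 1)}


def total(lam, s, f):
    """Payoff vector lam*g(s) + (1 - lam)*f(s) of the one shot game G(f, lam)."""
    return tuple(lam * G[s][k] + (1 - lam) * f[s][k] for k in range(2))


def is_pure_nash(lam, s, f):
    """True if the pure profile s is a Nash equilibrium of G(f, lam)."""
    base = total(lam, s, f)
    for a in ACTIONS:
        if total(lam, (a, s[1]), f)[0] > base[0]:   # deviation of player 1
            return False
        if total(lam, (s[0], a), f)[1] > base[1]:   # deviation of player 2
            return False
    return True


def pure_T(lam, W):
    """Payoffs of pure equilibria of some G(f, lam) with f : S -> W (a subset of T(W))."""
    out = set()
    for values in product(list(W), repeat=len(PROFILES)):
        f = dict(zip(PROFILES, values))
        for s in PROFILES:
            if is_pure_nash(lam, s, f):
                out.add(total(lam, s, f))
    return out


def is_self_generating(lam, W):
    """Check the inclusion W in pure_T(W), which implies W in T(W)."""
    return set(W) <= pure_T(lam, W)
```

For the candidate set {(3, 3), (1, 1)} the inclusion holds at $\lambda = 1/2$ and fails at $\lambda = 9/10$, where too much weight is put on today's payoff for cooperation to be sustained by this set.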

Comments: 

- The result is not true for Nash equilibrium payoffs; in fact, there is no
optimality principle for Nash equilibria.

- The fact that the game is discounted is also crucial because it implies
continuity of the payoffs (and enables us to use the optimality principle).

- It is also important because it leads to stationarity: the game starting
tomorrow is the same as today's game (which is not true in finite games or in games
with time dependent discount factors).

We now want to extend this study to the case of non standard signalling, more 
precisely, to the case of public equilibria of games with signals. 

2.2.2 Application to Public Signals 



Consider the component of the signals that is public and denote by $Y$ the
corresponding set of public signals. It can be defined through the finest
$\sigma$-algebra that is included in the information of any of the players. This means that
a signal is in $Y$ if it is the maximum signal one can give to all the players, in
addition to their own signal, without changing their information. Denote by
$\phi$ the function from $S$ to $Y$ that defines for each profile of moves the public
component of the signal.

Consider the following example: player 1 chooses a row and player 2 chooses a
column. In the matrix the letters a, b, c, d represent outcomes.

a b
c d*

Each player is informed only of his own choice if the outcome is a, b, or c, and
he knows his action and * if the outcome is d. Hence this game is a game with
private signals, and the public component of the signal is * or not *.
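The public component in this example can be computed as the finest partition of the move profiles that is coarser than each player's own information partition. The sketch below is one way to do this, with a small union-find; the row and column labels T, B, L, R are names chosen here for illustration, with outcome d placed at (B, R).

```python
from itertools import product

ROWS, COLS = ("T", "B"), ("L", "R")   # assumed labels for the rows and columns
PROFILES = list(product(ROWS, COLS))


def h1(s):
    """Signal of player 1: own row, plus '*' only when the outcome is d = (B, R)."""
    return (s[0], "*" if s == ("B", "R") else None)


def h2(s):
    """Signal of player 2: own column, plus '*' only when the outcome is d = (B, R)."""
    return (s[1], "*" if s == ("B", "R") else None)


def public_partition(signal_functions):
    """Finest partition of S that is coarser than every player's information partition."""
    parent = {s: s for s in PROFILES}

    def find(s):
        while parent[s] != s:
            s = parent[s]
        return s

    # Merge two profiles whenever some player receives the same signal after both.
    for h in signal_functions:
        for s, t in product(PROFILES, repeat=2):
            if h(s) == h(t):
                parent[find(s)] = find(t)

    classes = {}
    for s in PROFILES:
        classes.setdefault(find(s), set()).add(s)
    return sorted(classes.values(), key=len)
```

The result has two classes, the single profile leading to d and the three other profiles, which is the public signal "* or not *" described in the example.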

The previous result extends to public equilibria of the game with non standard 
signalling, in the following sense. 
Definition: A public strategy for player $i$ is a strategy that depends only on
the public part of the signals.

A public equilibrium is a public strategy profile such that no public strategy can
be a unilateral profitable deviation.

Lemma 2.3: A public equilibrium is an equilibrium, i.e. if everyone uses public
strategies, the best response is a public strategy.

Proof: To find a best response to some public strategies one can look only for
strategies that are measurable with respect to the finest $\sigma$-algebra generated
by the public signals. (The argument is the same as in Blackwell's Theorem in dynamic
programming.) ■

Notations: $F^*(W)$ is now the set of functions $f : Y \to W$, where $W$ is a fixed
bounded subset of $\mathbb{R}^N$. Let $G^*(f, \lambda)$ be the one shot game with the same sets
of players and moves, and payoffs $\lambda g + (1 - \lambda) f \circ \phi$; $E^*_f(\lambda)$ is its set of Nash
equilibrium payoffs. $E'_\lambda$ is the set of public perfect equilibrium payoffs of $G_\lambda$.
Let $T^*_\lambda$ be the following operator: $T^*_\lambda(W) = \bigcup_{f \in F^*(W)} E^*_f(\lambda)$.

Theorem 2.4: $E'_\lambda$ is the largest fixed point of $T^*_\lambda$.

Hence this result gives a characterization of all subgame perfect public equilibria
in the discounted game. It is not true for all equilibria; the word public is crucial
here.

Idea of the proof:

If one restricts oneself to public histories, i.e. to the sequence of the public
component of the signal at each stage, the other parts are not used, so one can
forget them. One can easily check that the same proof as in theorem 2.2 holds. ■

The previous result is true for all $\lambda$. We would like to get more insight into what
happens when $\lambda$ is close to 0, as we do in the folk theorem.

2.2.3 Asymptotic Results 



The following results have been obtained in a series of papers by Fudenberg, 
Levine and Maskin (see [11] and [12]). 

The idea is to study the geometry of $E'_\lambda$ as $\lambda$ goes to 0. Recall that if $f$ is in
$E'_\lambda$ there is a one stage payoff $g$, and a constellation of payoffs according to
the first period public signal, $f^*$ ($f^* \in F^*(E'_\lambda)$), such that $f$ is an equilibrium
payoff in the game with payoff $\lambda g + (1 - \lambda) f^* \circ \phi$.



Definitions: A point $f$ in a half space $W$ of $\mathbb{R}^N$ is $W$-$\lambda$-stable if there exists
$f^* \in F^*(W)$, and an equilibrium $\sigma$ in $G^*(f^*, \lambda)$ such that
$f = E_\sigma[\lambda g(s) + (1 - \lambda) f^* \circ \phi(s)]$. In words, $f$ is the equilibrium payoff of a game where the payoff
tomorrow is in $W$.

A half space $W$ is $\lambda$-reproducing if:

$\forall f \in W$, $f$ is $W$-$\lambda$-stable.

Notation: A half space $W$ is defined by a vector $\alpha \in \mathbb{R}^N$ such that $\|\alpha\| = 1$
and a number $k \in \mathbb{R}$, such that $W = \{u \mid \langle u, \alpha \rangle \le k\}$.

$W_\alpha(\lambda)$ is the maximal (for the inclusion) $\lambda$-reproducing half space defined by $\alpha$
and some real $k = k(\alpha, \lambda)$.

Lemma 2.5: $W$ is $\lambda$-reproducing iff there is an $f$ in its boundary that is
$W$-$\lambda$-stable.

Proof:

We only have to prove the sufficiency. So let $f$ be such that:

$\exists f' \in F^*(W),\ \sigma \in E^*_{f'}(\lambda) : f = E_\sigma[\lambda g(s) + (1 - \lambda) f' \circ \phi(s)]$.

Take any $\ell$ in $W$. Then it is easy to see that, associated to the same strategy
$\sigma$, $\ell$ is an equilibrium payoff in the game $G^*(f' + (\ell - f)/(1 - \lambda), \lambda)$ ($\ell - f$ is a
constant).

As $f$ is on the boundary of $W$, $f' + (\ell - f)/(1 - \lambda)$ is in $F^*(W)$. Hence $\ell$ is
$W$-$\lambda$-stable, and $W$ is $\lambda$-reproducing. ■
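The two facts used in this proof can be checked in a few lines. The shorthand $c = (\ell - f)/(1 - \lambda)$ is introduced here only for the computation; $y$ denotes a public signal and $\alpha$, $k$ define $W$ as in the notation above.

```latex
\begin{align*}
E_\sigma\big[\lambda g(s) + (1-\lambda)\,(f' + c)\circ\phi(s)\big]
  &= E_\sigma\big[\lambda g(s) + (1-\lambda)\, f'\circ\phi(s)\big] + (1-\lambda)\,c \\
  &= f + (\ell - f) = \ell, \\[4pt]
\langle f'(y) + c,\ \alpha\rangle
  &= \langle f'(y), \alpha\rangle + \frac{\langle \ell, \alpha\rangle - \langle f, \alpha\rangle}{1-\lambda}
  \;\le\; k + \frac{\langle \ell, \alpha\rangle - k}{1-\lambda} \;\le\; k .
\end{align*}
```

The first line uses that $c$ is a constant, so adding it to the continuation payoff changes no incentive and $\sigma$ remains an equilibrium. The second uses $\langle f, \alpha\rangle = k$ (as $f$ is on the boundary), $\langle f'(y), \alpha\rangle \le k$ and $\langle \ell, \alpha\rangle \le k$.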

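Lemma 2.6: $W_\alpha(\lambda)$ is independent of $\lambda$.

Proof:

Let $W$ be a $\lambda$-reproducing half space, of direction $\alpha$, and let $f$ be a point in $W$.
Write $f = E_\sigma[\lambda g + (1 - \lambda) f^* \circ \phi]$ with $f^* \in F^*(W)$ and $\sigma \in E^*_{f^*}(\lambda)$.

For any $\mu$, define

$\ell = \mu f^* + (1 - \mu) f$.

This equation defines a function $\ell$ in $F^*(W)$ if $\mu$ is in $[0,1]$.

For $\mu = [\lambda'(1 - \lambda)]/[\lambda(1 - \lambda')]$ ($\mu$ is between 0 and 1 iff $\lambda > \lambda'$), one can write:

$f = E_\sigma[\lambda' g(s) + (1 - \lambda') \ell \circ \phi(s)]$.

One has now to prove that $\sigma \in E^*_\ell(\lambda')$. Since $\sigma \in E^*_{f^*}(\lambda)$, for every player $i$
and every deviation $\tau^i$ (the inequality being read in player $i$'s coordinate),

$E_{\sigma^{-i}, \tau^i}[\lambda g + (1 - \lambda) f^* \circ \phi] \le E_\sigma[\lambda g + (1 - \lambda) f^* \circ \phi]$. (1)

Let us prove the same kind of inequality, for $\ell$ and $\lambda'$.

$E_{\sigma^{-i}, \tau^i}[\lambda' g + (1 - \lambda')(\mu f^* + (1 - \mu) f) \circ \phi]$

$\le f$ because of (1)

$\le E_\sigma[\lambda' g + (1 - \lambda') \ell \circ \phi]$

Hence $\sigma$ is in $E^*_\ell(\lambda')$, so that $W$ is $\lambda'$-reproducing for $\lambda' < \lambda$, and $W_\alpha(\lambda) \subset W_\alpha(\lambda')$
for $\lambda' < \lambda$.

Let us now prove the converse.

It is enough to take $f$ on the boundary of $W$ (see lemma 2.5).

$f^* \circ \phi(s)$ is in $W$ and as $\mu$ is bigger than 1, $f^* \circ \phi(s)$ is between $f$ and $\ell \circ \phi(s)$,
for all $s$. Hence, $\ell$ is in $F^*(W)$.

One can prove as before that $\sigma$ is in $E^*_\ell(\lambda')$ and deduce that for $\lambda' > \lambda$,

$W_\alpha(\lambda) \subset W_\alpha(\lambda')$.

Hence, $\forall \lambda, \lambda'$, $W_\alpha(\lambda) = W_\alpha(\lambda')$. ■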
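The choice of $\mu$ in this proof can be verified directly. The computation below uses only the definitions from the proof ($\ell = \mu f^* + (1-\mu) f$ and $\mu = \lambda'(1-\lambda)/[\lambda(1-\lambda')]$); it is a worked check, not an additional result.

```latex
\begin{align*}
(1-\lambda')\,\mu &= \frac{\lambda'(1-\lambda)}{\lambda},
\qquad
(1-\lambda')(1-\mu) = 1 - \frac{\lambda'}{\lambda}, \\[4pt]
\lambda' g + (1-\lambda')\big(\mu f^* + (1-\mu) f\big)\circ\phi
 &= \frac{\lambda'}{\lambda}\,\big[\lambda g + (1-\lambda)\, f^*\circ\phi\big]
    + \Big(1 - \frac{\lambda'}{\lambda}\Big) f .
\end{align*}
```

Taking the expectation under $\sigma$, the bracket has expectation $f$, so the whole expression equals $f$: this is the identity $f = E_\sigma[\lambda' g + (1-\lambda')\,\ell\circ\phi]$. Taking the expectation under a unilateral deviation, the bracket is at most $f$ in the deviator's coordinate by (1), and since $\lambda'/\lambda > 0$ the whole expression is at most $f$, which is the inequality needed for $\sigma \in E^*_\ell(\lambda')$.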

Notations: Let us simply denote $W_\alpha(\lambda)$ by $W_\alpha$ and let $Q$ be $\bigcap_\alpha W_\alpha$.
We have the following result:

Theorem 2.7: If the interior of $Q$ is nonempty, then

$\lim_{\lambda \to 0} E'_\lambda = Q$.

Sketch of the proof : 

a) Proof of $E'_\lambda \subset Q$

Let $E$ be the convex hull of $E'_\lambda$, $f$ an extreme point of $E$, $H$ a supporting
hyperplane containing $f$ and $W$ the half space of boundary $H$ containing $E$.

As $f$ is an extreme point of $E$, it is in $E'_\lambda$. So there is an $f^*$ in $F^*(E'_\lambda)$ and
$\sigma \in E^*_{f^*}(\lambda)$ such that $f = E_\sigma[\lambda g(s) + (1 - \lambda) f^* \circ \phi(s)]$.

Hence $f$ is $W$-$\lambda$-stable, so that $W$ is $\lambda$-reproducing and if $\alpha$ is the direction of
$W$, $W \subset W_\alpha$. Finally $E \subset W_\alpha$.

This is true for all $f$, extreme point of $E$. Hence $E$ is included in all the sets
$W_\alpha$, such that $\alpha$ is the direction of a supporting hyperplane of $E$, through an
extreme point of $E$. This implies $E \subset Q$.

b) Proof of $Q \subset \lim_{\lambda \to 0} E'_\lambda$

The interior of $Q$ is nonempty; $Q$ can thus be approximated by convex compact
sets with a smooth boundary. Let $Q_\varepsilon$ be such an approximation.

Let $f$ be on the boundary of $Q_\varepsilon$. We can consider the tangent hyperplane to
$Q_\varepsilon$ at $f$, $H$, and denote by $W$ the half space of boundary $H$ containing $Q_\varepsilon$. $W$
is included in some $W_\alpha$, by definition of $Q$. Let $f'$ be the continuation payoff
expressing that $f$ is $W_\alpha$-$\lambda$-stable. For all $s$ in $S$, $f - f'(s)$ is of the order of $\lambda$
and the distance between $H$ and $Q_\varepsilon$ is of the order of $\lambda^2$, so for $\lambda$ small enough,
$f'(s)$ is in $Q_\varepsilon$ for all $s$.

Hence $Q_\varepsilon \subset T^*_\lambda(Q_\varepsilon)$ for $\lambda$ small enough, and we have already seen that this implies
that $Q_\varepsilon \subset E'_\lambda$. ■

Comments 

a) In Fudenberg and Levine [11], the result is shown in the framework of a game
with short run and long run players; if $L$ is the number of long run players, the
result is: if $\dim(Q) = L$, then $\lim_{\lambda \to 0} E'_\lambda = Q$.
This is due to the fact that the short run players (who live only for one day)
play one shot best responses to the strategies of the long run players. So you
can compute, for each strategy profile of the long run players, the strategies of
the short run players, and hence the payoffs. You are then reduced to the case
with only long run players and an upper semi continuous payoff correspondence.

b) The result relates $\lim_{\lambda \to 0} E'_\lambda$ to a quantity that depends only on the one
shot game, as in the folk theorem. But in general, $Q$ is different from $F$. In fact
there are two issues:

- Is the set of public equilibria equal to the set of equilibria? The answer is no.
This is clearly shown by a game with 3 players: player 3 is a dummy and has no
signal on the moves, while the other two players observe the moves. There is no
public signal, and there are obviously some equilibria where player 1 and player
2 correlate their moves, using their information.

- Is the set of equilibrium payoffs equal to $F$?

In a game where there is no signal (except, for each player, his own move), the
set of equilibrium payoffs is the convex hull of $E_1$ (all you can do is play a one
shot equilibrium at each stage). Hence the answer is no.

The example of the partnership game in Fudenberg and Levine [11] shows that
the answer is no even if the players have some information. Signals may
introduce a lack of Pareto optimality.

c) The same results hold if the public component of the signal is random. 

2.3 Private Signals 

This section follows the work of Lehrer (see [20]). There are many other results 
by Lehrer concerning different kinds of equilibria. For more details, we refer to 
[18], [19], [21], [22].

We want to study the case of private signals and look for results concerning 
equilibria in games without discounting. We consider the two player case. 
Recall that after each move $s$, player $i$ is told $h^i(s)$. We assume that $i$ knows
his own move, and that signalling is nontrivial in the following sense:

$\forall i,\ \exists s^i, s^{-i}, t^{-i}$ such that $h^i(s^i, s^{-i}) \neq h^i(s^i, t^{-i})$.

Hence the players are able to communicate through actions. Otherwise, at least 
one player has no information at all on his opponent’s behavior and the analysis 
is straightforward. 

Since the signalling is not standard, there are usually no subgames, because
past histories are not common knowledge. Hence we will not deal with subgame
perfection. In a Nash equilibrium with standard signalling, given any history,
the future behavior of the players is a Nash equilibrium, on the equilibrium
path. But in the present situation, the players don't have the same information
about the past, so this is not the case.

After any history, each player has a private signal on the random variable
describing the future history for a given strategy profile. Then, on a Nash
equilibrium path of the original game, the strategies induce a correlated equilibrium of
the game starting after some history. Correlated equilibria appear naturally in
this framework (for a more precise definition of this notion we refer to the chapter
on "Communication, correlation and cooperation" by this author in the present
book), and they are easier to characterize. This is why we will focus on
correlated equilibria rather than on Nash equilibria.

We first define a few relations between moves. 

Definition: Two moves of player $i$, $s^i$ and $t^i$, are equivalent ($s^i \sim t^i$) if:

$\forall s^j,\ h^j(s^i, s^j) = h^j(t^i, s^j)$,

i.e. player $j$ cannot distinguish between $s^i$ and $t^i$.

Definition: The move $s^i$ is at least as informative as $t^i$ ($s^i \succ t^i$) if:

$s^i \sim t^i$ and, for $j \neq i$, $\forall s^j, \forall t^j,\ [h^i(t^i, s^j) \neq h^i(t^i, t^j)] \Rightarrow [h^i(s^i, s^j) \neq h^i(s^i, t^j)]$.

This means that player $j$ cannot distinguish $s^i$ from $t^i$ but also that $s^i$ is at least
as informative as $t^i$. If $i$ is asked what he knows about his opponent's move, he
can give as good an answer if he is playing $s^i$ as if he was playing $t^i$. So, if $i$
deviates from $t^i$ to $s^i$, player $j$ will never detect it even by asking questions to
$i$ about the signals he received.



The following example shows the difference between the two definitions. Take 
a game with two players, where each of them has two possible strategies. The 
following matrices are the signalling matrices of the two players: 

i) 1 ) 

If player 1 is supposed to play Bottom, and deviates and plays Top, the
information of player 2 is the same, whatever he does. Hence Top $\sim$ Bottom. But if 1
plays Top he cannot distinguish the situation where player 2 plays Right from
the one where he plays Left, whereas if he had played Bottom he could have
done it.
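The two relations are easy to test mechanically once the signalling matrices are written down. In the sketch below the matrices `H1` and `H2` are hypothetical: they are chosen only to be consistent with the verbal description of the example (player 2 learns nothing about player 1's move; player 1 sees player 2's move only when he plays Bottom), and the signal labels are arbitrary.

```python
ROWS = ("T", "B")   # moves of player 1 (Top, Bottom)
COLS = ("L", "R")   # moves of player 2 (Left, Right)

# Hypothetical signalling matrices, consistent with the verbal description only.
H1 = {("T", "L"): "a", ("T", "R"): "a", ("B", "L"): "b", ("B", "R"): "c"}
H2 = {(r, c): "x" for r in ROWS for c in COLS}


def equivalent(s, t):
    """s ~ t: player 2 gets the same signal whether player 1 plays s or t."""
    return all(H2[(s, c)] == H2[(t, c)] for c in COLS)


def at_least_as_informative(s, t):
    """s is equivalent to t and separates every pair of columns that t separates."""
    if not equivalent(s, t):
        return False
    return all(
        H1[(s, c)] != H1[(s, d)]
        for c in COLS
        for d in COLS
        if H1[(t, c)] != H1[(t, d)]
    )
```

With these matrices Top and Bottom are equivalent, Bottom is at least as informative as Top, but Top is not at least as informative as Bottom. A deviation from Bottom to Top could therefore be exposed by asking player 1 what he observed.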

This notion is difficult to extend to more than two players: if there are three 
players, and if player 1 knows that 2 has deviated, can he tell it to 3, in order 
for them to correlate their moves to punish 2? 

It doesn’t extend immediately either to the case of random signals. 
Definition:

$C^i = \{Q \in \Delta(S^1 \times S^2);\ \forall s^i \in S^i,\ \forall t^i \in S^i$ such that $t^i \succ s^i$,

$\sum_{s^j} Q(s^i, s^j)\, g^i(s^i, s^j) \ge \sum_{s^j} Q(s^i, s^j)\, g^i(t^i, s^j)\}$

The interpretation is that $Q$ is a correlation device (see Chapter "Communication,
correlation and cooperation" in this book) according to which a signal
is chosen. The players are asked to follow the recommendation given by the
signal they receive (player $i$ receives the signal $s^i$ with probability $Q^i(s^i)$). $Q$
is in $C^i$ if a deviation from the strategies prescribed by the signal is detectable
(possibly through questions about the information of the deviating player) or
non profitable.

Theorem 2.8: The set of correlated equilibrium payoffs in the infinitely
repeated game is $g(\bigcap_{i \in N} C^i) \cap IR$.

Remark: Recall that with private signals, there is a problem even to define the 
individually rational level; but in the two player case, there is no problem of 
correlation for punishing, so the definition is the same as in the case of standard 
signalling. 

Sketch of the proof : 

a) Proof that any correlated equilibrium payoff is in $g(\bigcap_{i \in N} C^i) \cap IR$.

As in the usual proof of the folk theorem, if a plan leads to a non-IR payoff,
player $i$ can deviate to get at least $v^i$ at each stage.

If a plan leads to a payoff that is not in $g(\bigcap_i C^i)$, with positive probability, by
definition of $C^i$, a player $i$ has a non detectable, profitable deviation. In fact,
in a set of stages that has a positive density in the set of stages, there exists 
a move that induces a better payoff and the same signal. Moreover, this move 
is at least as informative as the recommended one. Suppose a player deviates 
and plays this move. He can construct a fictitious history as follows: he can 
compute for each stage, the signals he would have received if he had played 
as recommended, and hence the moves he would have played. If he is asked a 
question about his information at any stage, he can answer according to this 
fictitious history. Hence, with positive probability a player has a non detectable 
profitable deviation. 

b) Proof of the converse. 

The proof is involved and introduces a lot of new ideas, so we can only indicate 
some of the main points. 

Let $x$ be a payoff in $g(\bigcap_i C^i) \cap IR$ induced by some $Q$.

The idea is to choose the moves at random at each stage according to $Q$, and
the players are supposed to follow this recommendation. Then if one player
deviates, the deviation will be detected and he will be punished. To detect a
deviation that may be equivalent to the proposed strategy (but less informative),
the players have to communicate, and to ask each other questions about what
they know. They communicate through actions to ask these questions, hence
non trivial signalling is used here.

More precisely, a profile of pure moves s is chosen according to Q. The players 
are supposed to play according to this choice s. In order to detect deviations, 
a player must be able to check that the move his opponent played is equivalent 
to the recommended one. This may not be possible if Q is not of full support. 
Even if Q is of full support, a player may not be able to check that the move his 
opponent played is as informative as the recommended one if he never knows 
the recommendation the other player received. The idea is to define a sequence
$Q_n$ converging to $Q$, such that each $Q_n$ is of full support and with a small
probability (small enough not to affect the payoff because of the new possible
deviations it introduces) a player is informed of $s$ and not only of $s^i$.
The strategies are as follows:

A profile of pure moves $s$ is chosen according to $Q_n$ for the block $n$. The players
are supposed to play according to their recommendation for a large number of
stages, say $2^n$. Then they begin a communication phase of length $2n$ where
each player chooses at random a number $p$ smaller than $2^n$ and tells it to the
other player (which can be done in $n$ periods) who must tell in response what he
played at stage $p$, and what he knows of his opponent's move at stage $p$. If player
$i$ cannot give all the information he should have if he had played according to
the recommendation (which is known to player $-i$ with some small probability),
$i$ is punished forever. As the game is infinitely repeated, a profitable deviation
implies deviating at infinitely many stages. Hence, with probability one, if a
player deviates and uses a strategy that is not at least as informative as the
recommended one, the deviation will be detected. Hence the only deviations
that may be profitable are the ones that cannot be punished in this way, i.e.
that cannot be detected by the above procedure. The only possibly profitable
deviations for player $i$ from a recommended move $s^i$ are to moves $t^i$ such that
$s^i$ is less informative than $t^i$, i.e. $t^i \succ s^i$. By definition of $C^i$ such deviations are unprofitable.

The communication phase is short as compared to the other, hence it does not
influence the payoff.

Hence altogether, x is a correlated equilibrium payoff. ■ 



Conclusion 



In repeated games, folk-theorem-like results are numerous and sometimes
involved, in particular when non standard signalling is concerned. These results
connect the one shot game with the equilibria of the repeated game.

More precisely, under standard signalling, they express the set of equilibrium 
payoffs of the repeated game as the set of feasible and individually rational 
payoffs, hence relate a strategic notion (equilibrium) to a cooperative notion 
(feasible payoffs). The results show that repetition gives the possibility, under
some conditions, to achieve any individually rational Pareto efficient payoff, i.e.
to cooperate; but there are also many other equilibrium payoffs that are not
optimal and that can be achieved as well. Repetition may lead to cooperation.
A further question is whether these "most cooperative" payoffs (Pareto optimal
ones) are in some sense more likely to arise.

Achieving cooperation may fail if full monitoring is not assumed. In fact, if
there is not enough public information, two phenomena arise: first, it is more
difficult to define a public plan (which is one of the cooperative aspects: social
norm); second, it is harder to make it self enforcing (since deviations are less
likely to be observed, suspicion will occur). Hence in a non standard signalling
framework, cooperation may be more complicated to obtain and a lack of Pareto
optimality is more likely to occur.
The purpose of this presentation is to introduce models of extension of games
with preplay or intraplay information and communication. These extensions 
will allow us to define new notions of equilibria. The relevant question is to see 
how the outcomes change when communication between players is allowed, or 
when they are given some kind of preplay information. 

This section will be divided into two parts: the first one about concepts and
the second one about mechanisms. We first define the basic tools (correlation 
device, correlated equilibrium, communication equilibrium...). Then, we provide 
different mechanisms of communication that lead to these equilibria. 

The concepts studied here have been introduced by Aumann [1] and Forges [4] . 
The mechanisms presented in this section are essentially the works of Barany 
[3], Forges [6] and Lehrer [10]. For related surveys and further results see [12], 
[16] and [13]. 

1 Concepts 

We are going to introduce three concepts: correlated equilibrium, extensive form 
correlated equilibrium and communication equilibrium. All these concepts are 
extensions of the notion of equilibrium in a game. These new equilibria are Nash
equilibria of some new games, that are extensions of the original one. They
consist of two parts, one related to the problems of correlation and communication
among the players, independently of the initial game, and the other one to the
strategic play in the game itself.

1.1 Correlated Equilibrium 

The concept is due to Aumann [1]. We will first illustrate the idea with two 
examples. 

^ Notes written by Dinah Rosenberg and revised by the author (December 1994). 
1.1.1 Examples and Intuition
Example 1 

Consider the battle of sexes. There are two players, He and She. He plays the 
rows and She plays the columns. She wants to go to the theater and He wants 
to go to the movies. T and L stand for the movies and B and R for the theater. 
The payoffs are as follows : 



         L        R

T      (2,1)    (0,0)

B      (0,0)    (1,2)

The problem of this game is how the players can correlate their moves. Their 
common interest is obviously to be in the same place at the same time, so they 
would like to correlate their choices in order to achieve it. 

Suppose the two players are told an integer between 0 and 9 chosen at random.
They decide to go to the movies if the integer is at most 4, and to go to
the theater if it is at least 5. One can check that whatever number is
announced, no one has an incentive to deviate from this plan. Moreover, the
expected payoff is (3/2, 3/2), which is not feasible in the one shot game. In this
case, the players coordinate through public messages.

This coordination leads to a distribution on the cells of the payoff matrix : the 
cell (2,1) is reached with probability 1/2, the cell (1,2), with probability 1/2, 
and the cells (0,0) with probability 0. 

The following example is an example of correlation where, unlike example 1, the 
players play according to private signals. 

Example 2 

Consider the following game G : 

          L       R
    T   (7,2)   (0,0)
    B   (6,6)   (2,7)

Suppose there are three messages: white, grey, black, that appear with
probability (1/3, 1/3, 1/3). Suppose player I can distinguish white from grey or black,
and player II can distinguish white or grey from black. They get respectively
the signals W, GB, or WG, B. In that case the signals are not public. The
probability of joint messages can be represented by the following matrix:

           WG      B
    W     1/3      0
    GB    1/3     1/3

The game is played as follows:

- a colour is chosen at random; 

- each player receives the private signal corresponding to this choice; 

- the players choose actions in G. 

Consider the following strategies : 

Player I : if the signal is W, play Top, and if the signal is GB, play Bottom.
Player II : if the signal is WG, play Left, and if the signal is B, play Right. 
Then no player has an incentive to deviate from these strategies. 

The induced distribution on the matrix leads to the payoff (5,5) which is outside 
the convex hull of the set of Nash equilibrium payoffs, which are: 

{(7, 2), (2, 7), (14/3, 14/3)}. 
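
Example 2 can be verified in the same way. The sketch below recomputes the payoff (5,5) and checks that no signal-contingent deviation is profitable; the data layout and the names `payoff` and `gain_from_deviation` are assumptions made for illustration only.

```python
from fractions import Fraction

G = {("T", "L"): (7, 2), ("T", "R"): (0, 0),
     ("B", "L"): (6, 6), ("B", "R"): (2, 7)}
COLOURS = {"white": Fraction(1, 3), "grey": Fraction(1, 3), "black": Fraction(1, 3)}
SIGNAL_I = {"white": "W", "grey": "GB", "black": "GB"}    # what player I sees
SIGNAL_II = {"white": "WG", "grey": "WG", "black": "B"}   # what player II sees
STRAT_I = {"W": "T", "GB": "B"}
STRAT_II = {"WG": "L", "B": "R"}


def payoff():
    """Expected payoff when both players follow the strategies of Example 2."""
    u = [Fraction(0), Fraction(0)]
    for colour, pr in COLOURS.items():
        cell = G[(STRAT_I[SIGNAL_I[colour]], STRAT_II[SIGNAL_II[colour]])]
        u[0] += pr * cell[0]
        u[1] += pr * cell[1]
    return tuple(u)


def gain_from_deviation(player):
    """Largest gain (not normalised by the signal probability) that `player`
    (0 for I, 1 for II) obtains by changing his move after one of his signals."""
    signals = SIGNAL_I if player == 0 else SIGNAL_II
    moves = "TB" if player == 0 else "LR"
    best = Fraction(0)
    for sig in set(signals.values()):
        for alt in moves:
            gain = Fraction(0)
            for colour, pr in COLOURS.items():
                if signals[colour] != sig:
                    continue
                row = STRAT_I[SIGNAL_I[colour]]
                col = STRAT_II[SIGNAL_II[colour]]
                dev = (alt, col) if player == 0 else (row, alt)
                gain += pr * (G[dev][player] - G[(row, col)][player])
            best = max(best, gain)
    return best
```

A return value of 0 from `gain_from_deviation` means that the best a player can do after every signal is to keep the prescribed move.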

These are examples of correlated equilibria. Let us come to the formal definition 
expressing that given the signal and the ex-post probability it induces on the 
other players’ signals, hence on their moves, the strategies are best responses to 
one another. 

1.1.2 Definition and Properties

Let G be a game in normal form, defined by a finite set of players N of cardinality
n, a finite set of pure moves $S^i$ for each player i, and a payoff function
$g : S = \prod_{i \in N} S^i \to \mathbb{R}^n$. As usual, $-i$ denotes the set of players in N except i, and
$S^{-i} = \prod_{j \neq i} S^j$. Let C be a probability space $(\Omega, \mathcal{A}, p)$ ($\Omega$ is the space of states
of nature). There are no restrictions on C.

Let $\theta^i$ be a measurable function from C to the finite set of signals of player i,
$A^i$; $\theta^i$ is the signalling function of player i.

Remark: the game G is fixed, it does not depend on the state of nature. The 
probability space defines only a signal in order for the players to make non 
independent choices; but the actual game they play is known to everybody, this 
being the difference with a Bayesian game. 

Formally we have: 

Definition: A correlation device is an (n + 1)-tuple of the form $\gamma = (C, \theta^1, \ldots, \theta^n)$.

Then we introduce: 

Definition: Let us define a new game $G_\gamma$, G extended by $\gamma$, as follows (it is a
two stage game):

Stage 1 : $\omega \in \Omega$ is chosen according to p; $\theta^i(\omega)$ is told to player i.

Stage 2 : each player chooses an action in G.

A strategy for player i is a function $\sigma^i : \Omega \to \Delta(S^i)$ such that if $\theta^i(\omega) = \theta^i(\omega')$
then $\sigma^i(\omega) = \sigma^i(\omega')$, i.e. $\sigma^i$ is measurable with respect to i’s information
partition induced by $\theta^i$ on $\Omega$.

If each player i plays the mixed move $\sigma^i(\omega)$, and if $\omega$ is chosen, the resulting
profile of mixed moves is $\sigma(\omega) = (\sigma^1(\omega), \ldots, \sigma^n(\omega))$, and the payoff of a player
j is $g^j(\sigma(\omega)) = \sum_{s \in S} \prod_{i \in N} \sigma^i(\omega)(s^i)\, g^j(s)$. The payoff of player j in $G_\gamma$ if the players play
the strategies $\sigma$ is thus $g^j_\gamma(\sigma) = \int_\Omega g^j(\sigma(\omega))\, p(d\omega)$.

The players’ actions depend on what they know about $\omega$. As their information
about it is correlated, the signal about $\omega$ enables them to correlate their
moves, though strategic independence is kept. Correlation is achieved through
an exterior signal about which two players are at least partially informed.

Definition: A Nash equilibrium of $G_\gamma$ is a correlated equilibrium of G. Let C
be the set of all correlated equilibrium payoffs of G obtained by letting the
correlation device $\gamma$ vary. If $E(G_\gamma)$ denotes the set of Nash equilibrium payoffs of
$G_\gamma$, then $C = \bigcup_\gamma E(G_\gamma)$.

For the connection between this notion and the notion of sunspot equilibria, we 
refer to Peck [15]. 

Comparison between two correlated equilibria is difficult: since the
spaces C and the signals may be different, the sets of strategies may also differ.
So, to compare two correlated equilibria, one looks at the distributions they
induce on S.

Definition: If $\sigma$ is a profile of pure strategies in $G_\gamma$ (i.e. for all $\omega$, $\sigma(\omega)$ belongs
to S) and s is a profile of pure moves, $Q_\sigma(s)$ is the probability of the event
$\{\omega ; \sigma(\omega) = s\}$.

This definition can be extended to mixed strategies : if $\sigma$ is a profile of mixed
strategies, and $p_{\sigma(\omega)}(s)$ is the probability of the pure move s if $\sigma(\omega)$ is played,
namely $p_{\sigma(\omega)}(s) = \prod_{i \in N} \sigma^i(\omega)(s^i)$, one has

$$Q_\sigma(s) = \int_\Omega p_{\sigma(\omega)}(s)\, p(d\omega).$$

Q represents the way the players correlate their moves through the signals. In 
the examples we saw that they achieve in this way some probability distributions 
on outcomes that are not achievable if the players choose independently mixed 
strategies. 

Q can be represented by a matrix on S.

In example 1, Q is

    1/2     0
     0     1/2

In example 2, Q is

    1/3     0
    1/3    1/3
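
For a finite space of states and pure strategies, the induced distribution is obtained by summing the probabilities of the states that lead to each profile of moves. A minimal sketch, with a hypothetical helper `induced_distribution` and the data of Example 2 as usage:

```python
from collections import defaultdict
from fractions import Fraction


def induced_distribution(prob, signals, strategies):
    """Q(s) = p({omega : sigma(omega) = s}) for pure strategies on a finite Omega.

    prob:       dict state -> probability
    signals:    list (one per player) of dicts state -> signal
    strategies: list (one per player) of dicts signal -> move
    """
    q = defaultdict(Fraction)
    for omega, pr in prob.items():
        s = tuple(strat[sig[omega]] for sig, strat in zip(signals, strategies))
        q[s] += pr
    return dict(q)


# Usage with the device and strategies of Example 2.
third = Fraction(1, 3)
Q2 = induced_distribution(
    {"white": third, "grey": third, "black": third},
    [{"white": "W", "grey": "GB", "black": "GB"},
     {"white": "WG", "grey": "WG", "black": "B"}],
    [{"W": "T", "GB": "B"}, {"WG": "L", "B": "R"}],
)
```

Profiles that are never played, such as (T, R) here, are simply absent from the returned dictionary and have probability 0.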

The problem is to characterize the set of distributions that can be obtained
from a correlated equilibrium. We have a well defined map that associates to
any game G and any device $\gamma$ a game $G_\gamma$; then one can consider its Nash
equilibria, and associate to each of them the distribution it induces over S.
Let us denote by CED the set of correlated equilibrium distributions, i.e. the
set of distributions Q over S such that there exists a correlation device $\gamma$ and
an equilibrium of $G_\gamma$ inducing Q. We now define a specific class of correlated
equilibria :

Definitions:

i) A canonical correlation device is a correlation device such that $\Omega = S$,
$\mathcal{A} = \mathcal{P}(S)$ (the subsets of S) and $\theta^i(s) = s^i$ ($\omega$ is a profile of moves and the
signal of each player is his own move).

ii) A canonical correlated equilibrium is a correlated equilibrium such that :

- the correlation device is canonical;

- player i’s equilibrium strategy is $\sigma^i$ with $\sigma^i(s) = s^i$ (if i received the signal
$s^i$ then he plays $s^i$).

Recall that $\sigma^i$ is actually a function of i’s signal, hence $\sigma^i$ is the identity.

Theorem 1.1 (canonical representation): The set of correlated equilibrium dis- 
tributions is equal to the set of canonical correlated equilibrium distributions. 

Remark: This theorem is a version of the revelation principle (see [14]). 

Proof : Let us denote by CCED the set of canonical equilibrium distributions.
We have to show that CED is included in CCED.

Take p in CED. Let us denote by H the original game, and by G an extension 
that induces p. Define G' as a canonical extension of H such that the proba- 
bility distribution on S is p. Let us prove that for each player, following the 
recommendation is an equilibrium strategy. 

In G' each player has less information than in G. Hence if there is a profitable 
deviation in G', it was also available in the original extension of the game since 
in G', the player knows only the move and not the signal he would have received 
in G. 

As the payoff depends only on p and not on the correlation device $\gamma$, the
deviation was also profitable in the original game.

So, if p is in CED, there can be no profitable deviation from the recommendation 
given by the signal given in G'. Hence p is in CCED. ■ 

Comments: 

- To determine the set of correlated equilibria one can restrict oneself to the
following framework: each player first receives a recommendation about what
action to choose (the recommendations for the players are chosen jointly and at
random according to a fixed probability), then each player chooses an action,
and it is an equilibrium to have each of them follow the recommendation.

- This theorem allows us to restrict the analysis to canonical equilibria, and 
hence makes computations more tractable. 

- Nevertheless, in constructive proofs, it may be useful not to restrict ourselves 
to canonical equilibria, and to use the great diversity of possible correlation 
devices. 

We now come to some properties of correlated equilibria. 

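Property 1.2: Q is in CED iff for all i in N and for all pairs of moves $(s^i, t^i)$,

$$\sum_{s^{-i} \in S^{-i}} Q(s^i, s^{-i})\, g^i(s^i, s^{-i}) \;\ge\; \sum_{s^{-i} \in S^{-i}} Q(s^i, s^{-i})\, g^i(t^i, s^{-i}).$$

Proof:

Let Q be a correlated equilibrium distribution. It can be induced by a canonical
correlated equilibrium.

The expected payoff of i, receiving signal $s^i$ and playing $t^i$, is, up to
normalization by the probability of the signal $s^i$,

$$\sum_{s^{-i} \in S^{-i}} Q(s^i, s^{-i})\, g^i(t^i, s^{-i})$$

(if the other players follow the recommendations, i.e. play $s^{-i}$).

The equilibrium condition says that for every i, for every $s^i$ and every $t^i$,

$$\sum_{s^{-i} \in S^{-i}} Q(s^i, s^{-i})\, g^i(s^i, s^{-i}) \;\ge\; \sum_{s^{-i} \in S^{-i}} Q(s^i, s^{-i})\, g^i(t^i, s^{-i}).$$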
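
The inequalities of Property 1.2 are finite in number and easy to test mechanically. The function below is a sketch of such a test for any finite game; its name and the dictionary-based encoding of Q and g are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def is_correlated_equilibrium(q, g, move_sets):
    """Check the linear inequalities of Property 1.2.

    q:         dict profile -> probability (missing profiles have probability 0)
    g:         dict profile -> tuple of payoffs, one per player
    move_sets: list of move lists, one per player
    """
    n = len(move_sets)
    for i in range(n):
        others = [move_sets[j] for j in range(n) if j != i]
        for s_i, t_i in product(move_sets[i], repeat=2):
            gain = Fraction(0)
            for rest in product(*others):
                s = rest[:i] + (s_i,) + rest[i:]   # recommended profile
                t = rest[:i] + (t_i,) + rest[i:]   # profile after i's deviation
                gain += Fraction(q.get(s, 0)) * (g[t][i] - g[s][i])
            if gain > 0:
                return False
    return True


# Usage with the game of Example 2.
GAME = {("T", "L"): (7, 2), ("T", "R"): (0, 0),
        ("B", "L"): (6, 6), ("B", "R"): (2, 7)}
MOVES = [["T", "B"], ["L", "R"]]
third = Fraction(1, 3)
Q_EXAMPLE_2 = {("T", "L"): third, ("B", "L"): third, ("B", "R"): third}
Q_ALL_ON_BL = {("B", "L"): Fraction(1)}
```

Putting all the mass on (B, L) fails the test, because player I, when told B, gains by playing T (7 instead of 6).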



The set of correlated equilibrium payoffs will be denoted by CP. 

Property 1.3: CED and CP are convex polyhedra.

Proof :

The inequalities expressing that Q is in CED are linear, and one can therefore
concentrate on pure strategies. We then have a finite collection of linear
inequalities defining CED, which is hence a convex polyhedron.

Since the map that associates to a distribution on S the corresponding payoff 
is linear, CP is a polyhedron as well. ■ 

Property 1.4: The set of Nash equilibrium distributions is contained in the 
set of correlated equilibrium distributions, which is hence nonempty. 
Remark: As CP is a convex polyhedron, there is an elementary proof based on
the separation theorem of convex polyhedra showing that the set of correlated 
equilibria is nonempty (see [9]). 

In particular, it implies that if the problem one starts with has rational pa- 
rameters, there are correlated equilibria with rational parameters. Hence, if all 
Nash equilibrium distributions have irrational parameters, there is a correlated 
equilibrium distribution that is not a Nash equilibrium distribution. 

In the following example, due to Moulin and Vial, there exists a correlated 
equilibrium payoff that strictly dominates all Nash equilibrium payoffs. 
Example 3: The game is represented by the following matrix:

    (5,4)   (4,5)   (0,0)
    (0,0)   (5,4)   (4,5)
    (4,5)   (0,0)   (5,4)

The unique Nash equilibrium has both players playing (1/3, 1/3, 1/3), and induces
a payoff of (3,3).

We now exhibit a canonical correlated equilibrium that induces the (strictly better)
payoff (9/2, 9/2). It is defined by the canonical device represented by the
following matrix :

    1/6   1/6    0
     0    1/6   1/6
    1/6    0    1/6
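
The payoffs quoted in Example 3, and the fact that following the recommendations is an equilibrium, can be recomputed exactly. This is a sketch for the two-player 3x3 case only; the names `A`, `Q`, `UNIFORM`, `payoff` and `obeys_recommendations` are illustrative.

```python
from fractions import Fraction

# A[r][c] = (payoff of player I, payoff of player II)
A = [[(5, 4), (4, 5), (0, 0)],
     [(0, 0), (5, 4), (4, 5)],
     [(4, 5), (0, 0), (5, 4)]]
sixth = Fraction(1, 6)
Q = [[sixth, sixth, 0],
     [0, sixth, sixth],
     [sixth, 0, sixth]]
# Distribution induced by the mixed Nash equilibrium (1/3, 1/3, 1/3) for both.
UNIFORM = [[Fraction(1, 9)] * 3 for _ in range(3)]


def payoff(dist):
    """Expected payoff vector under a distribution on the 9 cells."""
    return tuple(sum(dist[r][c] * A[r][c][k] for r in range(3) for c in range(3))
                 for k in range(2))


def obeys_recommendations(dist):
    """True if no player gains by replacing a recommended move with another."""
    for r in range(3):        # player I is told row r and considers row t
        for t in range(3):
            if sum(dist[r][c] * (A[t][c][0] - A[r][c][0]) for c in range(3)) > 0:
                return False
    for c in range(3):        # player II is told column c and considers column t
        for t in range(3):
            if sum(dist[r][c] * (A[r][t][1] - A[r][c][1]) for r in range(3)) > 0:
                return False
    return True
```

When told a row, player I expects 9/2 from obeying and at most 5/2 from either alternative, which is why the device is self-enforcing.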



We now come to Aumann’s approach to correlated equilibria in terms of Bayesian
maximisation.

1.1.3 Aumann’s Theorem

The issue of this section has been studied by Aumann in [2].

The framework is the following. Assume you have a collection of players and
a probability distribution P on $\Omega$ that represents the beliefs of a player about
all players’ actions. We assume that P is the same for all the players, that
each player is Bayesian rational, and that the players play according to P. Then
the initial probability is a correlated equilibrium distribution.

Rationality and Bayesian players imply that the players play according to P and
that this is a best response to the situation where the others play according to P.
The proof is very simple but the result is striking: it expresses the emergence
of correlated equilibria as the expression of the rationality of Bayesian players
with consistent beliefs.
1.2 Extensive Form Correlated Equilibrium

Take an extensive form game. You can reduce it to a normal form game, and find 
the correlated equilibria. They define the correlated equilibria of the extensive 
form game. 

But in the original formulation you may do more: before each stage of the 
game, a state of nature is chosen and a signal is given. You extend the game 
by a family of such correlation devices, and consider the Nash equilibria of the 
extended game. They define the extensive form correlated equilibria. 

Definition: Let G be a finite game played in T stages (1,...,T). An extensive
form correlation device for G is a collection of sets $M_t^j$ for each player j and
each period t, and a probability function $p_t$ for each period t on $M_t^1 \times \cdots \times M_t^n$.

The extensive form game is here a game that can be played in stages: there is
a public calendar for all the players. It is an extensive form game in the sense
of von Neumann.

The game $G_\gamma$, G extended by the extensive form correlation device $\gamma$, is the
game played in 2T stages as follows ($k \in \{1, \ldots, T\}$):

- stage 2k − 1: a message $(m_k^1, \ldots, m_k^n)$ in $M_k^1 \times \cdots \times M_k^n$ is chosen according
to the probability $p_k$, and player i is informed of $m_k^i$.

- stage 2k: knowing their messages, the players who have to play at stage k of
the game G choose an action available at that stage k of G.

Definition: An extensive form correlated equilibrium of game G is a Nash 
equilibrium of an extension of G by an extensive form correlation device. 

You can define as before a canonical extensive form correlated equilibrium, and
get a canonical representation theorem.

A canonical extensive form correlation device is a correlation device such that
at each time t, the signal is a profile of moves s, feasible at stage t, and each
player i’s signal is the i-th component $s^i$ of this signal.

A canonical extensive form correlated equilibrium is an extensive form correlated
equilibrium such that the correlation device is a canonical extensive form
correlation device, and such that at any time, if the last signal of player i has
been $s^i$, his best response to it is $s^i$ (it is an equilibrium strategy to follow the
recommendation given by the signal).

Theorem 1.5: The set of extensive form correlated equilibrium distributions is 
equal to the set of canonical extensive form correlated equilibrium distributions. 

The following example due to Myerson [14], shows that the set of extensive form 
correlated equilibrium distributions may strictly exceed the set of correlated 
equilibrium distributions. 
Example 4: We focus on the canonical extensive form correlated equilibria of
this game, and on the canonical correlated equilibria. 

The game is as follows: 




The game is played in two stages: 

Stage 1: we call the node at this stage a; player I must choose between U
and D; if he chooses U the payoff is (2,2); if he chooses D, we go to stage 2.

Stage 2 : we call the node at this stage b; the players then play the following
normal form game :

          L       R
    T   (5,0)   (0,0)
    B   (0,0)   (0,5)

Suppose a message is sent at point b. The signal is (T, L) with probability 1/2,
and (B, R) with probability 1/2. The first component is the message received
by player I, and the second component is the message received by player II.
The strategies of player I are, at b :

- if the message is T, play Top;

- if the message is B, play Bottom.

The strategies of player II are :

- if the message is L, play Left;

- if the message is R, play Right.

These strategies define a correlated equilibrium of the subgame starting at b,
leading to the payoff (5/2, 5/2), and hence, player I will play D at point a. We
have thus an extensive form correlated equilibrium with payoff (5/2, 5/2).

If messages can only be sent at the beginning of the game, player I will never
follow a recommendation telling him to play D followed by B since this strategy
is strictly dominated. So player II will never get more than 2 in a normal
form correlated equilibrium. Hence in this example, there is an extensive form
correlated equilibrium payoff that is not achievable as a normal form correlated
equilibrium payoff.
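
The arithmetic behind Example 4 is short enough to spell out. The sketch below assumes the payoffs and the device described above; the constant and function names are illustrative only.

```python
from fractions import Fraction

U_PAYOFF = (2, 2)                       # payoff if player I chooses U at node a
STAGE2 = {("T", "L"): (5, 0), ("T", "R"): (0, 0),
          ("B", "L"): (0, 0), ("B", "R"): (0, 5)}
DEVICE_AT_B = {("T", "L"): Fraction(1, 2), ("B", "R"): Fraction(1, 2)}


def subgame_payoff():
    """Payoff at node b when both players follow the message sent at b."""
    return tuple(sum(pr * STAGE2[m][k] for m, pr in DEVICE_AT_B.items())
                 for k in range(2))


def player_one_choice_at_a():
    """Player I compares the continuation payoff after D with the payoff of U."""
    return "D" if subgame_payoff()[0] > U_PAYOFF[0] else "U"


def db_is_strictly_dominated_by_u():
    """D followed by B gives player I 0 whatever II does, against 2 from U."""
    return all(STAGE2[("B", col)][0] < U_PAYOFF[0] for col in "LR")
```

The last function captures why a recommendation "D then B" revealed at the start would never be followed: player I would rather take the sure payoff 2.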

We will consider now a more general notion, where the players can communicate, 
i.e. where they can influence the signals they receive. 

1.3 Communication Equilibrium 

The concepts of this section are due to Forges [4]. We now allow players to send 
signals to a machine and to receive signals from this machine at each period 
of time. The machine is independent of the game, because if the players could 
choose the machine, an informed player could manipulate the information. We 
extend the game by this information structure and the Nash equilibria of the 
extended game are the communication equilibria of the original game. 

Remark: If players are allowed only to receive messages from the machine, but 
not to send messages, the definition reduces to the previous one of extensive 
form correlated equilibrium, and if they can only receive a message at the first 
period, the definition reduces to the one of normal form correlated equilibrium. 

We come now to the formal definition. 

Definitions: Let G be a game played in T stages. 

i) A communication device d for the game G is a collection of sets of inputs
$I_t^j$ for any player j at any time t, of sets of outputs $O_t^j$ for any player j at
any time t, and of possibly random functions $p_t$ from $\prod_j I_t^j$ to $\prod_j O_t^j$.

ii) The extension $G_d$ of the game G by the communication device d is the
following game: each stage t is composed of three parts; during the first one
each player j sends to the machine an input $i_t^j$ in $I_t^j$; then he receives the output
$o_t^j$, player j’s component of $p_t(i_t^1, \ldots, i_t^n)$, from the machine; knowing $o_t^j$, he
chooses an action as he would have done at stage t of game G.

iii) A communication equilibrium of the game G is a Nash equilibrium of a 
game Gd extended by a communication device d. 

Notice that the communication device does not depend on the game. If it were 
the case, the strategic issues would be totally different because the players could 
try to influence the machine. This would change the strategic structure of the 
game and the result would not be an extension of the original game but rather 
another strategically different game. 
We have a theorem of canonical representation :

Definition: A canonical communication device is a communication device such
that $I_t^j$ is the set of information of player j at time t, and $O_t^j$ is the set of moves
of player j at time t.

Theorem 1.6: The set of distributions over strategies that can be induced by 
a communication equilibrium is equal to the set of distributions of strategies 
that can be induced by a canonical communication equilibrium. 

We have the same convexity property: 

Property 1.7: The set of correlated equilibrium payoffs, the set of extensive 
form correlated equilibrium payoffs, and the set of communication equilibrium 
payoffs are convex polyhedra. 

Proposition 1.8: The set of communication equilibrium distributions may be
strictly larger than the set of extensive form correlated equilibrium distributions.



This is due to the fact that in the extension of a game by a communication 
device, the players can both correlate their moves through the signals they 
receive, and transmit information to one another through the signals they send 
to the machine. 

The proof is given by the following example : 

The game is a game of incomplete information. One of the following games $G_1$
and $G_2$ is chosen with probability 1/2, and player I is informed of the true game
while player II is not.

    $G_1$:      L       R
          T   (1,1)   (0,0)
          B   (1,1)   (0,0)

    $G_2$:      L       R
          T   (0,0)   (1,1)
          B   (0,0)   (1,1)

The difference between the set of communication equilibrium distributions and 
the set of extensive form correlated equilibrium distributions is due to the pos- 
sibility to send messages that allow for communication. Hence the informed 
player, who is a dummy, can transmit his information to the uninformed player. 
Let us describe the machine and the strategies: 

- the machine sends back all the messages it receives; 

- player I sends the message T if k = 1, and B if k = 2; when player II
receives the message he is hence informed of the state of the world.

- player II plays L if the message he receives is T and R if the message is
B.

These strategies are obviously a communication equilibrium and they lead to
the payoff (1,1).

In any extensive form correlated equilibrium, player II has no information about
the true game, and hence he plays in the following average game (as player I
is a dummy, his information has no impact):

    (1/2, 1/2)   (1/2, 1/2)
    (1/2, 1/2)   (1/2, 1/2)

The payoff is (1/2, 1/2), hence the result. ■ 
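
The gap between the two payoffs can be recomputed in a few lines. The sketch below uses the fact that both players receive the same payoff, which depends only on the true game and on player II's move; all names are illustrative.

```python
from fractions import Fraction

# Common payoff as a function of the true game k and of player II's move.
PAYOFF = {1: {"L": 1, "R": 0}, 2: {"L": 0, "R": 1}}
PRIOR = {1: Fraction(1, 2), 2: Fraction(1, 2)}


def payoff_with_communication():
    """Player I reports the true game, the machine echoes it, II best-responds."""
    message = {1: "T", 2: "B"}      # message sent by player I in game k
    reply = {"T": "L", "B": "R"}    # player II's move after the echoed message
    return sum(pr * PAYOFF[k][reply[message[k]]] for k, pr in PRIOR.items())


def best_payoff_without_information():
    """Best payoff for player II when his move cannot depend on the true game."""
    return max(sum(pr * PAYOFF[k][move] for k, pr in PRIOR.items())
               for move in "LR")
```

Without an input channel, a correlation device can only randomize player II's move, and every randomization over L and R yields 1/2.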

We have defined three different notions of equilibria, that take into account the 
fact that players may communicate, or use some tools to correlate their actions. 
We have also seen some of their properties. The three notions of equilibria 
we have defined here introduce more and more opportunities of communication 
through machines. But one must be careful not to introduce too many oppor- 
tunities in order not to affect too much the structure of the game. For example 
if there were a different correlation device for each node of an extensive game, 
it would affect the information sets of the players. 

We now want to see if one can achieve these equilibria without relying on some 
outside machine, but only on what the players can do by themselves. 

2 Mechanisms 



We are looking for mechanisms through which the players can generate by
themselves the different classes of equilibria discussed above.

2.1 Correlated Equilibrium Through the Phone 

This part follows the work of Barany [3]. The idea is to allow the players to have 
private conversations, i.e. to communicate through the phone. In that case we 
will see that they can achieve any correlated equilibrium. 

More precisely, any correlation device can be achieved through private conver- 
sation if there are at least four players. 

The framework is as follows: There are m players. Each player i has a finite
set of pure strategies $S^i$; S is the product of $S^1, \ldots, S^m$. The players want to
correlate their choices, i.e. to choose $s \in S$ with a predetermined probability p,
each player k knowing only $s^k$.

This can be done with a machine that chooses s in S with probability p, and
sends the message $s^k$ to k, i.e. through a correlation device. Can the players
find a protocol that enables them to do this without the help of a machine?

Definition: A protocol is a finite set of rules describing, for each step r, which 
player is active and what action this active player should take at this stage. 

Definition: The information of player k at stage r is composed of all the
messages he sent and received, and all the choices he made up to that stage.
This set of information is denoted by $I_r^k$.

Definition: An action of player k at stage r is one of the following things :

- Make a random choice $z_r^k$; knowing $I_r^k$, compute a message $m_r^k$ (the rules give
the functions that map $I_r^k$ to $m_r^k$), and send it to a player specified by the
rules (call him and tell him $m_r^k$).

- Compute $s^k$ knowing $I_r^k$ (the rules give functions $f^k$ such that $f^k(I_r^k) = s^k$).

- If for some $I_r^k$ there is no sequence of random choices of the other players that
leads to this set of information, send to everybody the message “Deviation”.

Definition: The player k deviates from the rules if for some sequence $(I_r^k)$ (for
r such that k is active), there is an $r^*$ such that the message k sends is not
among the messages permitted by the rules at that stage.

Definition: A sure protocol is a protocol such that, writing $I^k$ for the
information of player k when he computes $s^k$ :

(1) $p(s^1, \ldots, s^m) = \mathrm{Prob}(f^1(I^1), \ldots, f^m(I^m))$

(2) $p(s \mid s^k) = \mathrm{Prob}(f^1(I^1), \ldots, f^m(I^m) \mid I^k)$

(3) A deviation from the rules is detected with probability 1.

(4) A deviation consisting of choosing $z_r^k$ with a probability different from the
one prescribed by the rules does not influence (1) and (2).

This means that the random device mimics p and that the information the play- 
ers get is given by p. It means also that a deviation is detectable or innocuous. 
We can now state the result: 

Theorem 2.1: For any probability distribution p over S that is rational valued,
if m ≥ 4, there is a sure protocol that induces the correlation device p.

An immediate corollary of this theorem says that any rational correlated equi- 
librium of a game G can be achieved as a Nash equilibrium of an extension of 
the following kind : 
Stage 1: the players make phone calls;

Stage 2 : the players play the game G. 

The proof is quite long and composed of many steps, and we refer to the paper 
of Barany [3]. Here are some brief ideas. 

As p is rational valued, you can restrict yourself to the case where you have 
to implement a uniform probability over a given set. For instance to choose a 
number in {1,2,3}, with 1 having probability 1/2 and 2 and 3 probability 1/4, 
you can choose a number in {1, 1, 2, 3} uniformly. 
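
The reduction from a rational valued distribution to a uniform one amounts to repeating each outcome a number of times proportional to its probability. A minimal sketch, where `as_uniform_list` is a hypothetical helper name:

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def as_uniform_list(dist):
    """Turn a rational distribution {outcome: probability} into a list in which
    each outcome is repeated so that a uniform draw from the list follows dist."""
    fracs = {x: Fraction(p) for x, p in dist.items()}
    if sum(fracs.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    # Least common multiple of the denominators.
    denom = reduce(lambda a, b: a * b // gcd(a, b),
                   (f.denominator for f in fracs.values()), 1)
    out = []
    for x, f in fracs.items():
        out.extend([x] * int(f * denom))
    return out
```

For the distribution of the text, `as_uniform_list({1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)})` returns a list of length 4 in which 1 appears twice.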

In order to achieve Q, the players are supposed to make random choices, and
to communicate them to other players. It may happen that one of the players
prefers some outcomes to others. In that case, he may want to deviate: he
may want to lie when he is asked to send some information to another player.
The idea is that each random choice is known by at least two players; they both
tell it to a third one who can check whether he received the same information
from both; in that way he can check that no one has lied. Every message is
always received at the same time from two sources, to permit checking.

In order to get the same message from two sources, a player always sends the 
same message to two different interlocutors. They can then send it to a fourth 
player, that will receive two similar messages. 

Moreover, no one should know everything. That is the reason why you need at 
least four players. 

A second aspect is the following: to prevent the players from cheating you 
must not give them too much information. That is why the signals are encoded 
through permutations of the sets of actions, chosen with uniform probability by 
the players. 

For instance, if the players want to choose a number at random in a finite set
X, they can ask player 1 to choose a number x in X at random and to tell
it to players 3 and 4. Then player 2 chooses a permutation of X at random,
q, and tells it to 3 and 4. Players 3 and 4 then compute q(x), which is the
requested number. Players 1 and 2 cannot influence the final distribution. This
is a “jointly controlled lottery” and it is much easier to obtain if simultaneous
moves are available.
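
That neither player 1 nor player 2 can bias the outcome alone can be checked by exhaustive counting on a small set. The sketch below represents a permutation q as a tuple, with `q[i]` the image of `elements[i]`; this encoding and the function names are assumptions made for the illustration.

```python
from collections import Counter
from itertools import permutations


def outcome_counts_for_fixed_x(elements, x):
    """Count the values q(x) over all permutations q (player 2 is honest,
    player 1 insists on x)."""
    idx = elements.index(x)
    return Counter(q[idx] for q in permutations(elements))


def outcome_counts_for_fixed_q(elements, q):
    """Count the values q(x) over all x (player 1 is honest, player 2
    insists on the permutation q)."""
    return Counter(q[elements.index(x)] for x in elements)
```

In both cases every element of the set is obtained equally often, so q(x) is uniform as soon as one of the two players randomizes uniformly.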

Recall that this result relates correlated equilibria of a game to Nash equilibria 
of an extension of it. 

2.2 Communication Equilibrium with Incomplete Information

We want to generate any communication equilibrium distribution as correlated
equilibria of the game extended by plain conversation as in Forges [6].

Let G be a game with incomplete information (see Harsanyi [8]). Given m
players, we denote by $K_i$ the finite set of possible types of player i, and by $S^i$
his finite set of actions. K denotes the product of $K_1, \ldots, K_m$, and S denotes
the product of $S^1, \ldots, S^m$.

Let p be a probability on K, and g the payoff function from $K \times S$ to $\mathbb{R}^m$;
g depends on the real vector of types, i.e. the game depends on the state of
nature.

The game is played in the following way : 

- Step 1 : k is chosen in K according to p; player i is informed of $k_i$ in $K_i$.

- Step 2 : each player i chooses a move $s_i$ in $S^i$.

The extension of G through plain conversation is the following game :

- Step 1 : k is chosen in K according to p; player i is informed of his signal
$k_i$.

- Step 2 : each player may send public messages.

- Step 3 : each player i chooses a move $s_i$ in $S^i$.

Remark: This extension is a particular case of an extension by a communica- 
tion device. 

We have the following result (Forges [6]): 

Theorem 2.2: If m ≥ 4, the set of communication equilibrium distributions
of G is the set of correlated equilibrium distributions of G extended by plain
conversation.

More precisely, any communication equilibrium distribution can be achieved in 
the following way: 

- step 1: k is chosen in K according to p, and player i is informed of ki; 

- step 2: the players send public messages, twice, from a finite set of messages; 

- step 3: the players choose actions. 

Comments: 

- The mechanism is independent of the game, the priors, etc... 

- For three players, the result is true if infinite sets of messages are allowed at 
the second time where the players send messages. 

Sketch of the proof: 

Only one inclusion has to be proved (the second one is obvious). 

Let Q be a canonical communication equilibrium distribution. It induces a
probability Q' on mappings from K to S: let f be such a mapping, then
$Q'(f) = \prod_{k \in K} Q(f(k) \mid k)$.

The first idea is the following:

Let f be selected according to Q', and $f_i$ be told to player i. Then each player
learns his type $k_i$, and player i sends the public message $k_i$. Then f(k) is played.
This mechanism would lead to the right probability Q. But players get extra
information that they may use to find new profitable deviations.

The new idea is thus to use this kind of mechanism but to encode both $f_i$ and $k_i$.
For the detailed proof, we refer to the paper of Forges [6]. Hence permutations
of $K_i$ and $K \times S^i$ are chosen with uniform probability by the players (so as to
prevent deviations as in the previous mechanism) to encode the elements of $K_i$
and $K \times S^i$. ■

Starting with a game of incomplete information, one can apply Theorem 2.2, 
and then Theorem 2.1, to get the following result : 

Theorem 2.3: In a rational game with incomplete information, and at least 4 
players, any communication equilibrium distribution can be realized as a Nash 
equilibrium of the game extended by two phases of conversation : one (private) 
before the types are announced, and one (public) afterwards. 

2.3 Mediated Talk 

We now want to generate a correlation device with no restriction to 4 players and 
no private conversation or signal. This section follows Lehrer [10] and Lehrer 
and Sorin [11]. 

Let G be an m player game. We extend G by a pre-play communication phase.
In this phase player i selects, possibly randomly, a signal $a_i$ from a finite set $A_i$,
and sends it to a mediator. The mediator announces a public and deterministic
message $\mu(a_1, \ldots, a_m)$.

Then each player chooses an action that may obviously depend on $\mu(a)$, and on
$a_i$.

We have the following result. 

Theorem 2.4: Let Q be a correlated equilibrium distribution of G. Assume
that all the probabilities of Q are rational numbers. Then there exists a “public
mediated talk” extension of G, as defined above, that has a Nash equilibrium
that induces Q.

The fact that all the coefficients are rational permits us to restrict the study to
uniform distributions. We will show how the proof goes on an example.
Consider the following 2x2 game with 2 players :

    (7,7)   (3,8)
    (8,3)   (0,0)

We want to generate the following correlated equilibrium :

    1/2   1/4
    1/4    0

There are four different signals (a, b, c, d), each of them corresponding to a cell
of the matrix (a corresponds to Top-Left, b to Top-Right, c to Bottom-Left, d
to Bottom-Right). Suppose that the mediator sends the signals to the players
according to the following matrix denoted by M(a, b, c, d):

[4 x 4 matrix M(a, b, c, d): player I chooses a row and player II a column
among the numbers 1, ..., 4; the entries are not legible in the source]



If the signal is a it means that player I is told to play Top and player II to play 
Left. 

Then, if the players choose the numbers at random, they get the signals with 
the right probability. 

But, we also want the information of the players to correspond to the correlation 
device: for instance, if player I has chosen 1, and the signal is a, we want 
him to think that player II will play Left with probability 2/3 and Right with 
probability 1/3, and we want him to play Top. 

If we consider the signals as a recommendation, and if we suppose that the 
players will follow the recommendations, using the above matrix player I knows 
what player II will play if the signal is a. 

That is why we have to duplicate the matrix. We encode the message that will 
be received later. So, we construct a big 2x2 matrix according to which the 
mediator will compute the message he sends to the players. 

Player I chooses at random a pair in {T, B} x {1, ..., 4}. Player II chooses at
random a pair in {L, R} x {1, ..., 4}.

Each cell of the big matrix is a signalling matrix as defined above: in the cell
(T,L) put M(a, b, c, d), in (T,R) put M(b, a, d, c), in (B,L) put M(c, d, a, b), and
in (B,R) put M(d, c, b, a). The matrix is thus the following:

             L                R
    T   M(a, b, c, d)    M(b, a, d, c)
    B   M(c, d, a, b)    M(d, c, b, a)

The first component of the messages (the pair of letters) determines the matrix
chosen and the second one (the pair of numbers) the cell in the submatrix. In
the big matrix player I chooses a row and player II a column.

The mediator sends as a public message the letter corresponding to the choice of
the players: a if the choice was (T, 1, L, 1), b if the choice was (T, 1, R, 1), and so
on. T, L, R, ... are codes. Each block of the matrix gives the right probability
to each signal, and the meaning of each signal in terms of actions is encoded
through the block to which it belongs.

Note that each player has only partial information about this block.

The strategies are the following : 

Player 1:

If he sent T and the signal is a or b then he plays T.

If he sent T and the signal is c or d then he plays B.

If he sent B and the signal is a or b then he plays B.

If he sent B and the signal is c or d then he plays T.

Player 2:

If he sent R and the signal is a or c then he plays R.

If he sent R and the signal is b or d then he plays L.

If he sent L and the signal is a or c then he plays L.

If he sent L and the signal is b or d then he plays R.

Now the signalling matrices give the right ex-post probabilities on the signal the
other player sent, and hence on his action: if player I sent (T, 1) and the signal
is a, he knows we are on the first row. Hence with probability 1/3, player II
sent (L, 1), with probability 1/3, (R, 2), and with probability 1/3, (L, 3). So the
outcome (Top, Left) will appear with probability 2/3, and (Top, Right) with
probability 1/3, if player II follows the recommendation. The matrix leads to
the desired information structure.
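
The construction can be checked by brute force. The sketch below rests on stated assumptions. The 4x4 array `M` is a hypothetical signalling matrix of our own choosing, in which every row and every column contains a twice and b and c once each; it is not claimed to be the matrix of the text. The decoding rules are read as follows: player I plays the action he sent when the letter is a or b and the other action otherwise, and player II plays the action he sent when the letter is a or c and the other action otherwise. All function names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

# ASSUMPTION: a hypothetical signalling matrix M(a, b, c, d); each row and
# each column holds a twice and b, c once (probabilities 1/2, 1/4, 1/4, 0).
M = ["abac", "baca", "acab", "caba"]

# Relabelling of the letters a, b, c, d inside each block of the big matrix:
# (T,L) holds M(a,b,c,d), (T,R) M(b,a,d,c), (B,L) M(c,d,a,b), (B,R) M(d,c,b,a).
RELABEL = {("T", "L"): "abcd", ("T", "R"): "badc",
           ("B", "L"): "cdab", ("B", "R"): "dcba"}


def public_letter(x, i, y, j):
    """Letter announced when player I sends (x, i) and player II sends (y, j)."""
    return RELABEL[x, y]["abcd".index(M[i][j])]


def action_1(x, letter):
    """Player I plays the action he sent on a or b, the other one on c or d."""
    return x if letter in "ab" else {"T": "B", "B": "T"}[x]


def action_2(y, letter):
    """Player II plays the action he sent on a or c, the other one on b or d."""
    return y if letter in "ac" else {"L": "R", "R": "L"}[y]


def outcome_distribution():
    """Distribution on action pairs when both players send uniform messages."""
    dist = {}
    for x, i, y, j in product("TB", range(4), "LR", range(4)):
        letter = public_letter(x, i, y, j)
        cell = (action_1(x, letter), action_2(y, letter))
        dist[cell] = dist.get(cell, 0) + F(1, 64)
    return dist


def posterior_of_player_1(x, i, letter):
    """Player I's belief on player II's action, given his own message (x, i)
    and the public letter, when player II sends a uniform message."""
    hits = [action_2(y, letter) for y, j in product("LR", range(4))
            if public_letter(x, i, y, j) == letter]
    return {act: F(hits.count(act), len(hits)) for act in set(hits)}
```

With this hypothetical `M`, the induced distribution on action pairs is the desired one, and a player I who sent T and hears a or b assigns probability 2/3 to Left and 1/3 to Right, whatever number he chose. The same computation works for any array with the same row and column counts.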

If one of the players does not choose his message uniformly, the probability 
to get a given signal will be the same, and also the information he gets. So 
there can be no profitable deviation at that stage. Given the messages and 
the information they give, the fact that Q was initially a correlated equilibrium 
distribution implies that there is no profitable deviation at the action stage. 

In fact the result is more precise: given any correlation device there is a “public 
mediated talk” device that mimics it, i.e. that induces the same outcome and the 
same information (compare with a sure protocol, but here there is no “deviation” 
message). 







Conclusion 

In this lecture, we saw new notions of equilibrium related to the possibility for 
the players to correlate their moves and to communicate. 

Pre-play information and communication give the players larger strategy sets,
and hence make it possible to achieve outcomes that are Pareto preferred to all
Nash equilibrium outcomes.

One has to define precisely the kind of communication that is allowed: it has a 
great impact on the nature of the equilibria one obtains. 

Cooperation is a notion that is concerned with the behavior of the players,
namely their moves. But we have seen that it may come from another level:
through signals and information, the players can enrich their strategy spaces
and generate correlation that, in fine, induces the desired consequences for
their moves.



1. Introduction

Economists have long expressed dissatisfaction with the complex
models of strict rationality that are so pervasive in economic theory. There are
several objections to such models. First, casual empiricism or even just simple
introspection leads to the conclusion that even in quite simple decision problems,
most economic agents are not in fact maximizers, in the sense that they do not
scan the choice set and consciously pick a maximal element from it. Second, such
maximizations are often quite difficult, and even if they wanted to, most people
(including economists and even computer scientists) would be unable to carry
them out in practice. Third, polls and laboratory experiments indicate that people
often fail to conform to some of the basic assumptions of rational decision theory. 
Fourth, laboratory experiments indicate that the conclusions of rational analysis (as 
distinguished from the assumptions) sometimes fail to conform to "reality." And 
finally, the conclusions of rational analysis sometimes seem unreasonable even on 
the basis of simple introspection. 

From my point of view, the last two of the above objections are more 
compelling than the first three. In science, it is more important that the 
conclusions be right than that the assumptions sound reasonable. The assumption 
of a gravitational force seems totally unreasonable on the face of it, yet leads to 
correct conclusions. "By their fruits shall ye know them" (Matthew). 

In the sequel, though, we shall not hew strictly to this line; we shall 
examine various models that, between them, address all the above issues. 

To my knowledge, this area was first extensively investigated by Herbert 
Simon (1955, 1972). Much of Simon's work was conceptual rather than formal. 
For many years after this initial work, it was recognized that the area was of great 
importance, but the lack of a formal approach impeded its progress. Particular 
components of Simon's ideas, such as satisficing, were formalized by several 
workers, but never led to an extensive theory, and indeed did not appear to have 
significant implications that went beyond the formulations themselves. 

There is no unified theory of bounded rationality, and probably never will 
be. Here we examine several different but related approaches to the problem, 
which have evolved over the last ten or fifteen years. We will not survey the area, 
but discuss some of the underlying ideas. For clarity, we may sometimes stake 
out a position in a fashion that is more one-sided and extreme than we really feel; 
we have the highest respect and admiration for all the scientists whose work we 
cite, and beg them not to take offense. 

From the point of view of the volume of research, the field has "taken
off" in the last half dozen years. An important factor in making this possible was
the development of computer science, complexity theory, and so on, areas of 
inquiry that created an intellectual climate conducive to the development of the 
theory of bounded rationality. A significant catalyst was the experimental work 
of Robert Axelrod (1984) in the late seventies and early eighties, in which experts 
were asked to prepare computer programs for playing the repeated prisoners' 
dilemma. The idea of a computer program for playing repeated games presaged 
some of the central ideas of the later work; and the winner of Axelrod's 
tournament — tit-for-tat — was, because of its simplicity, nicely illustrative of the 
bounded rationality idea. Also, repeated games became the context of much of the 
subsequent work. 

The remainder of this lecture is divided into five parts. First we discuss 
the evolutionary approach to optimization — and specifically to game theory — 
and some of its implications for the idea of bounded rationality, such as the 
development of truly dynamic theories of games, and the idea of "rule rationality" 
(as opposed to "act rationality"). Next comes the area of "trembles," including
equilibrium refinements, "crazy" perturbations, failure of common knowledge of
rationality, the limiting average payoff in infinitely repeated games as an
expression of bounded rationality, ε-equilibria, and related topics. Section 4 deals
with players who are modeled as computers (finite state automata, Turing
machines), which has now become perhaps the most active area in the field. In
Section 5 we discuss the work on the foundations of decision theory that deals
with various paradoxes (such as Allais (1953) and Ellsberg (1961)), and with
results of laboratory experiments, by relaxing various of the postulates and so
coming up with a weaker theory. Section 6 is devoted to one or two open
problems.

Most of this lecture is set in the framework of non-cooperative game 
theory, because most of the work has been in this framework. Game theory is 
indeed particularly appropriate for discussing fundamental ideas in this area, 
because it is relatively free from special institutional features. The basic ideas are 
probably applicable to economic contexts that are not game-theoretic (if there are
any). 



2. Evolution 

2.1 Nash Equilibria as Population Equilibria 

One of the simplest, yet most fundamental ideas in bounded rationality 
— indeed in game theory as a whole — is that no rationality at all is required to 
arrive at a Nash equilibrium; insects and even flowers can and do arrive at Nash 
equilibria, perhaps more reliably than human beings. The Nash equilibria of a 
strategic (normal) form game correspond precisely to population equilibria of 
populations that interact in accordance with the rules — and payoffs — of the 
game. 

A version of this idea — the evolutionarily stable equilibrium — was first 
developed by John Maynard Smith (1982) in the early seventies and applied by 
him to many biological contexts (most of them animal conflicts within a species). 
But the idea applies also to Nash equilibria — not only to interaction within a 
species, but also to interactions between different species. It is worthwhile to give 
a more precise statement of this correspondence. 

Consider, then, two populations — let us first think of them as different 
species — whose members interact in some way. It might be predator and prey, 
or cleaner and host fish, or bees and flowers, or whatever. Each interaction 
between an individual of population A and one of population B results in an 
increment (or decrement) in the fitness of each; recall that the fitness of an 
individual is defined as the expected number of its offspring (I use "its" on 
purpose, since strictly speaking, reproduction must be asexual for this to work). 
This increment is the payoff to each of the individuals for the encounter in 
question. The payoff is determined by the genetic endowment of each of the
interacting individuals (more or less aggressive or watchful or keen-sighted or
cooperative, etc.). Thus one may write a bimatrix in which the rows and columns 
represent the various possible genetic endowments of the two respective species 
(or rather those different genetic endowments that are relevant to the kind of 
interaction being examined), and the entries represent the single encounter payoffs 
that we just described. If one views this bimatrix as a game, then the Nash 
equilibria of this game correspond precisely to population equilibria; that is, under 
asexual reproduction, the proportions of the various genetic endowments within 
each population remain constant from generation to generation if and only if these 
proportions constitute a Nash equilibrium. 

This is subject to the following qualification: in each generation, there 
must be at least a very small proportion of each kind of genetic endowment; that 
is, each row and column must be represented by at least some individuals. This 
minimal presence, whose biological interpretation is that it represents possible 
mutations, is to be thought of as infinitesimal; specifically, an encounter between 
two such mutants (in the two populations) is considered impossible. 
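
The correspondence can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a discrete-time replicator update in which the fitness of a type is a baseline (a hypothetical value of 2, chosen so that fitness stays positive) plus its expected single-encounter payoff, and it takes "matching pennies" as the bimatrix. It does not model the infinitesimal mutant presence described above, so it should only be evaluated at interior proportions.

```python
from fractions import Fraction as F


def next_generation(p, q, A, B, baseline=F(2)):
    """One generation of asexual reproduction for two interacting populations.

    p, q: proportions of the genetic endowments in populations A and B.
    A[i][j], B[i][j]: fitness increments of a type-i member of A and a
    type-j member of B from one encounter. `baseline` is an assumed
    background fitness common to all types.
    """
    fit_a = [baseline + sum(qj * A[i][j] for j, qj in enumerate(q))
             for i in range(len(p))]
    fit_b = [baseline + sum(pi * B[i][j] for i, pi in enumerate(p))
             for j in range(len(q))]
    mean_a = sum(pi * f for pi, f in zip(p, fit_a))
    mean_b = sum(qj * f for qj, f in zip(q, fit_b))
    # Each type's share grows in proportion to its relative fitness.
    new_p = [pi * f / mean_a for pi, f in zip(p, fit_a)]
    new_q = [qj * f / mean_b for qj, f in zip(q, fit_b)]
    return new_p, new_q


# Matching pennies: the unique Nash equilibrium is (1/2, 1/2) for both.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
half = [F(1, 2), F(1, 2)]
```

At the Nash equilibrium the proportions are reproduced exactly, `next_generation(half, half, A, B) == (half, half)`; starting from proportions (3/4, 1/4) in population A, the proportions in population B change in the next generation.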

A similar story can be told for games with more than two players, and for 
evolutionary processes other than biological ones; e.g., economic evolution, like 
the development of the QWERTY typewriter keyboard, studied by the economic 
historian Paul David (1986). It also applies to learning processes that are perhaps 
not strictly analogous to asexual reproduction. And though it does not apply to 
sexual reproduction, still one may hope that roughly speaking, similar ideas may 
apply. 

One may ask who are the "players" in this "game"? The answer is that 
the two "players" are the two populations (i.e., the two species). The individuals 
are definitely not the "players"; if anything, each individual corresponds to the 
pure strategy representing its genetic endowment (note that there is no sense in 
which an individual can "choose" its own genetic endowment). More accurately, 
though, the pure strategies represent kinds of genetic endowment, and not 
individuals. Individuals indeed play no explicit role in the mathematical model; 
they are swallowed up in the proportions of the various pure strategies. 

Some biologists object to this interpretation, because they see it as 
implying group or species selection rather than individual selection. The player 
is not the species, they argue; the individual "acts for its own good," not the good 
of the group, or of the population, or of the species. Some even argue that it is 
the gene (or rather the allele) that "acts for its own good," not the individual. The 
point, though, is that nothing in this model really "acts for its own good"; nobody 
"chooses" anything. It is the process as a whole that selects the traits. The most 
we can do is ask what it is that corresponds to the player in the mathematical 
model, and this is undoubtedly the population. 

A question that at first seems puzzling is what happens in the case of 
interactions within a species, like animal conflicts for females, etc. Who are the 
players in this game? If the players are the populations, then this must be a
one-person game, since there is only one population. But that doesn't look right,
either, and it certainly doesn't correspond to the biological models of animal conflicts.

The answer is that it is a two-person symmetric game, in which both
players correspond to the same population. In this case we look not for just any
Nash equilibria, but for symmetric ones only.



2.2 Evolutionary Dynamics

The question of developing a "truly" dynamic theory of games has long 
plagued game theorists and economic theorists. (If I am not mistaken, it is one 
of the conceptual problems listed by Kuhn and Tucker (1953) in the introduction 
to Volume II of "Contributions to the Theory of Games" — perhaps the last one 
in that remarkably prophetic list to be successfully solved.) The difficulty is that 
ordinary rational players have foresight, so they can contemplate all of time from 
the beginning of play. Thus the situation can be seen as a one-shot game each 
play of which is actually a long sequence of "stage games," and then one has lost 
the dynamic character of the situation. 

The evolutionary approach outlined above "solves" this conceptual 
difficulty by eliminating the foresight. Since the process is mechanical, there is 
indeed no foresight; no strategies for playing the repeated game are available to 
the "players." 

And indeed, a fascinating dynamic theory does emerge. Contributions to 
this theory have been made by Young (1993), Foster and Young (1990), and 
Kandori, Mailath, and Rob (1993). A book on the subject has been written by 
Hofbauer and Sigmund (1988) and there is an excellent chapter on evolutionary 
dynamics in the book by van Damme (1987) on refinements of Nash equilibrium. 
Many others have also contributed to the subject. 

It turns out that Nash equilibria are often unstable, and one gets various 
kinds of cycling effects. Sometimes the cycles are "around" the equilibrium, like 
in "matching pennies," but at other times one gets more complicated behavior. 
For example, the game in Figure 1 has ((1/3,1/3,1/3),(1/3,1/3,1/3)) as its only Nash
equilibrium; the evolutionary dynamics does not cycle "around" this point, but 
rather confines itself (more or less) to the strategy pairs in which the payoff is 4 
or 5. This suggests a possible connection with correlated equilibria; this 
possibility has recently been investigated by Foster and Vohra (1994). 

Thus evolutionary dynamics emerges as a form of rationality that is
bounded in that foresight is eliminated.



2.3 "Rule Rationality" vs. "Act Rationality"

In a famous experiment conducted by Güth et al. (1982) and later
repeated, with important variations, by Binmore et al. (1985), two players were
asked to divide a considerable sum of money (ranging as high as DM 100). The
procedure was that P1 made an offer, which could be either accepted or rejected
by P2; if it was rejected, nobody got anything. The players did not know each
other and never saw each other; communication was a one-time affair via
computer.

"Rational" play would predict a 99-1 split, or 95-5 at the outside. Yet in 
by far the most trials, the offered split was between 50-50 and 65-35. This is 
surprising enough in itself. But even more surprising is that in most (all?) cases 
in which P2 was offered less than 30 percent, he actually refused. Thus, he 
preferred to walk away from as much as DM 25 or 30. How can this be 
reconciled with ordinary notions of utility maximization, not to speak of game 
theory? 
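
The "rational" prediction is a one-step backward induction, which can be sketched as follows. The discretisation is an assumption of ours: offers are multiples of `step`, and P2 is taken to accept any strictly positive offer that is not below a hypothetical threshold `reject_below`.

```python
def predicted_split(pie=100, step=1, reject_below=0):
    """Split (P1 share, P2 share) chosen by a payoff-maximizing P1.

    ASSUMPTION: P2 accepts an offer iff it is strictly positive and at
    least `reject_below`; a rejected offer leaves both players with 0.
    """
    best_offer, best_payoff = None, -1
    for offer in range(0, pie + 1, step):
        accepted = offer > 0 and offer >= reject_below
        p1_payoff = pie - offer if accepted else 0
        if p1_payoff > best_payoff:
            best_offer, best_payoff = offer, p1_payoff
    return pie - best_offer, best_offer
```

With a grid of 1 the sketch returns the 99-1 split and with a grid of 5 the 95-5 split. If P1 instead expects offers below 30 to be refused, as observed in the experiment, `predicted_split(reject_below=30)` returns 70-30, which shows how strongly the prediction depends on P2's assumed acceptance rule.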

It is tempting to answer that a player who is offered five or ten percent
is "insulted." Therefore, his utilities change; he gets positive utility from
"punishing" the other player.

That's all right as far as it goes, but it doesn't go very far; it doesn't explain
very much. The "insult" is treated as exogenous. But obviously the "insult" arose 
from the situation. Shouldn't we treat the "insult" itself endogenously, somehow 
explain it game-theoretically? 

I think that a better way of explaining the phenomenon is as follows: 
ordinary people do not behave in a consciously rational way in their day-to-day 
activities. Rather, they evolve "rules of thumb" that work in general, by an 
evolutionary process like that discussed at 2.1 above, or a learning process with 
similar properties. Such "rules of thumb" are like genes (or, rather, alleles). If 
they work well, they are fruitful and multiply; if they work poorly, they become 
rare and eventually extinct. 

One such rule of thumb is "Don't be a sucker; don't let people walk all 
over you." In general, the rule works well, so it becomes widely adopted. As it 
happens, the rule doesn't apply to Güth's game, because in that particular situation,
a player who refuses DM 30 does not build up his reputation by the refusal 
(because of the built-in anonymity). But the rule has not been consciously chosen, 
and will not be consciously abandoned. 

So we see that the evolutionary paradigm yields a third form of bounded 
rationality: rather than consciously maximizing in each decision situation, players 
use rules of thumb that work well "on the whole."



3. Perturbations of Rationality

3.1 Equilibrium Refinements 

Equilibrium refinements — Selten (1975), Myerson (1978), Kreps and 
Wilson (1982), Kalai and Samet (1984), Kohlberg and Mertens (1986), Basu and 
Weibull (1991), Van Damme (1984), Reny (1992), Cho and Kreps (1989), and 
many others — don't really sound like bounded rationality. They sound more like 
super-rationality, since they go beyond the basic utility maximization that is 
inherent in Nash equilibrium. In addition to Nash equilibrium, which demands 
rationality on the equilibrium path, they demand rationality also off the 
equilibrium path. Yet all are based in one way or another on "trembles," that is, small
departures from rationality.

The paradox is resolved by noting that in game situations, one man's
irrationality requires another one's superrationality. You must be superrational in
order to deal with my irrationalities. Since this applies to all players, taking 
account of possible irrationalities leads to a kind of superrationality for all. To be 
superrational, one must leave the equilibrium path. Thus, a more refined concept 
of rationality cannot feed on itself only; it can only be defined in the context of 
irrationality. 



3.2 Crazy Perturbations

An idea related to the trembling hand is the theory of irrational or "crazy" 
types, as propounded first by the "gang of four" (Kreps, Milgrom, Roberts, and 
Wilson (1982)), and then taken up by Fudenberg and Maskin (1986), Aumann and 
Sorin (1989), Fudenberg and Levine (1989), and no doubt others. In this work 
there is some kind of repeated or other dynamic game set-up; it is assumed that 
with high probability the players are "rational" in the sense of being utility 
maximizers, but that with a small probability, one or both play some one strategy, 
or one of a specified set of strategies, that are "crazy" — have no a priori 
relationship to rationality. An interesting aspect of this work, which differentiates 
it from the "refinement" literature, and makes it particularly relevant to the theory 
of bounded rationality, is that it is usually the crazy type, or a crazy type, that 
wins out — takes over the game, so to speak. Thus, in the original work of the 
gang of four on the prisoner's dilemma, there is only one crazy type, who always 
plays tit-for-tat, no matter what the other player does; and it turns out that the 
rational type must imitate the crazy type, he must also play tit-for-tat, or 
something quite close to it. Also, the "crazy" types, while irrational in the sense 
that they do not maximize utility, are usually by no means random or arbitrary (as 
they are in refinement theory). For example, we have already noted that tit-for-tat 
is computationally a very simple object, far from random. In the work of Aumann 
and Sorin, the crazy types are identified with bounded recall strategies; and in the
work of Fudenberg and Levine (1989), the crazy types form a denumerable set,
suggesting that they might be generated in some systematic manner, e.g., by 
Turing machines. There must be method to the madness; this is associated with 
computational simplicity, which is another one of the underlying ideas of bounded 
rationality. 



3.3 Epsilon-Equilibria 

Rather than playing irrationally with a small probability, as in 3.1 and 3.2
above, one may deviate slightly from rationality by playing so as almost, but not
quite, to maximize utility; i.e., by playing to obtain a payoff that is within ε of
the optimum payoff. This idea was introduced by Radner (1980) in the context
of repeated games, in particular of the repeated prisoners' dilemma; he showed that
in a long but finitely repeated prisoners' dilemma, there are ε-equilibria with small
ε in which the players "cooperate" until close to the end (though, as is
well-known, all exact equilibria lead to a constant stream of "defections").
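
A small computation illustrates the idea. It is a sketch under assumptions of our own, not Radner's construction: the stage payoffs are illustrative, payoffs are averaged per stage, and both players are assumed to use the "grim" strategy (cooperate until the other defects, then defect forever). The best response against grim is found by dynamic programming over the number of stages left and over whether the opponent has already been triggered.

```python
from fractions import Fraction
from functools import lru_cache

# ASSUMPTION: illustrative stage payoffs to a player, indexed by
# (own action, opponent's action); C = cooperate, D = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 4, ("D", "D"): 1}


def grim_vs_grim_total(n):
    """Total payoff when both players follow grim for n stages."""
    return n * PAYOFF["C", "C"]


def best_response_total(n):
    """Highest total payoff attainable against grim in the n-stage game."""
    @lru_cache(maxsize=None)
    def value(stages_left, triggered):
        if stages_left == 0:
            return 0
        opponent = "D" if triggered else "C"
        return max(PAYOFF[me, opponent]
                   + value(stages_left - 1, triggered or me == "D")
                   for me in "CD")
    return value(n, False)


def epsilon(n):
    """Per-stage gain from the best deviation against grim."""
    return Fraction(best_response_total(n) - grim_vs_grim_total(n), n)
```

With these payoffs the only profitable deviation is to defect in the last stage, so `epsilon(n)` equals 1/n: mutual grim is an ε-equilibrium of the n-stage game for every ε of at least 1/n, although it is not an exact equilibrium.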



3.4 Infinitely Repeated Games with Limit-of-the-Average Payoff

There is an interesting connection between ε-equilibria in finitely
repeated games and infinitely repeated games with limit of the average payoff 
("undiscounted"). The limit of the average payoff has been criticized as not 
representing any economic reality; many workers prefer to use either the finitely 
repeated game or limits of payoffs in discounted games with small discounts. 
Radner, Myerson and Maskin (1986), Forges, Mertens and Neyman (1986), and 
perhaps others, have demonstrated that the results of these two kinds of analysis 
can indeed be quite different. 

Actually, though, the infinitely repeated undiscounted game is in some 
ways a simpler and more natural object than the discounted or finite games. In 
calculating equilibria of a finite or discounted game, one must usually specify the 
number n of repetitions or the discount rate δ; the equilibria themselves depend
crucially on these parameters. But one may want to think of such a game simply 
as "long," without specifying how long. Equilibria in the undiscounted game may 
be thought of as "rules of thumb," which tell a player how to play in a "long 
repetition," independently of how long the repetition is. Whereas limits of finite
or discounted equilibrium payoffs tell the player approximately how much payoff
to expect in a long repetition, analysis of the undiscounted game tells him
approximately how to play.

Thus, the undiscounted game is a framework for formulating the idea of 
a duration-independent strategy in a repeated game. Indeed, it may be shown that 
an equilibrium in the undiscounted game is an approximate equilibrium 
simultaneously in all the n-stage truncations, the approximation getting better and
better as n grows. Formally, a strategy profile ("tuple") is an equilibrium in the
undiscounted game if and only if for some sequence of ε_n tending to zero, each
of its n-stage truncations is an ε_n-equilibrium (in the sense of Radner described
above) in the n-stage truncation of the game.



3.5 Failure of Common Knowledge of Rationality 

In their paper on the repeated prisoners' dilemma, the Gang of Four 
pointed out that the effect they were demonstrating holds not only when one of 
the players believes that with some small probability, the other is a tit-for-tat 
automaton, but also if one of them only believes (with small probability) that the 
other believes this about him (with small probability). More generally, it can be 
shown that many of the perturbation effects we have been discussing do not 
require an actual departure from rationality on the part of the players, but only a 
lack of common knowledge of rationality (Aumann 1992). 



4. Automata, Computers and Turing Machines 

We come now to what is probably the "mainstream" of the newer work 
in bounded rationality, namely, the theoretical work that has been done in the last 
four or five years on automata and Turing machines playing repeated games. The 
work was pioneered by A. Neyman (1985) and A. Rubinstein (1986), working 
independently and in very different directions. Subsequently, the theme was taken 
up by Ben-Porath (1993), Kalai and Stanford (1988), Zemel (1989), Abreu and 
Rubinstein (1988), Ben-Porath and Peleg (1987), Lehrer (1988), Papadimitriou 
(1992), Stearns (1989), and many others, each of whom made significant new 
contributions to the subject in various different directions. Different branches of 
this work have been started by Lewis (1985) and Binmore (1987 and 1988), who 
have also had their following. 

It is impossible to do justice to all this work in a reasonable amount of 
space, and we content ourselves with brief descriptions of some of the major 
strands. In one strand, pioneered by Neyman, the players of a repeated game are 
limited to using mixtures of pure strategies, each of which can be programmed on 
a finite automaton with an exogenously fixed number of states. This is 
reminiscent of the work of Axelrod, who required the entrants in his experiment 
to write the strategies in a Fortran program not exceeding a stated limit in length.
In another strand, pioneered by Rubinstein, the size of the automaton is 
endogenous; computer capacity, so to speak, is considered costly, and any capacity 
that is not actually used in equilibrium play is discarded. The two approaches lead 
to very different results. The reason is that Rubinstein's approach precludes the 
use of "punishment" or "trigger" strategies, which swing into action only when a 
player departs from equilibrium, and whose sole function is precisely to prevent
such departures. In the evolutionary interpretation of repeated games, Rubinstein's
approach may be more appropriate when the stages of the repeated game represent 
successive generations, whereas Neyman's may be more appropriate when each 
generation plays the entire repeated game (which would lead to the evolution of 
traits having to do with reputation, like "Don't be a sucker"). 

The complexity of computing an optimal strategy in a repeated game, or 
even just a best response to a given strategy, has been the subject of works by 
several authors, including Gilboa (1988), Ben-Porath (1989), and Papadimitriou 
(1989). Related work has been done by Lewis (1992), though in the framework 
of recursive function theory (which is related to infinite Turing machines) rather 
than complexity theory (which has to do with finite computing devices). Roughly 
speaking, the results are qualitatively similar: finding maxima is hard. Needless 
to say, in the evolutionary approach to games, nobody has to find the maxima; 
they are picked out by evolution. Thus, the results of complexity theory again 
underscore the importance of the evolutionary approach. 

Binmore (1987 and 1988) and his followers have modeled games as pairs 
(or n-tuples) of Turing machines in which each machine carries in it some kind
of idea of what the other "player" (machine) might look like. 

Other important strands include work by computer scientists who have 
made the connection between distributed computing and games ("computers as 
players," rather than "players as computers"). For a survey, see Linial (1995). 



5. Relaxation of Rationality Postulates 

A not uncommon activity of decision, game, and economic theorists since 
the fifties has been to call attention to the strength of various postulates of 
rationality, and to investigate the consequences of relaxing them. Many workers 
in the field — including the writer of these lines — have at one time or another 
done this kind of thing. People have constructed theories of choice without 
transitivity, without completeness, violating the sure-thing principle, and so on. 
Even general equilibrium theorists have engaged in this activity, which may be 
considered a form of limited rationality (on the part of the agents in the model). 
This kind of work is most interesting when it leads to outcomes that are 
qualitatively different — not just weaker — from those obtained with the stronger 
assumptions; but I don't recall many such cases. It can also be very interesting 
and worthwhile when one gets roughly similar results with significantly weaker 
assumptions.



6. An Open Problem

We content ourselves with one open problem, which is perhaps the most 
challenging conceptual problem in the area today: to develop a meaningful formal 
definition of rationality in a situation in which calculation and analysis themselves 
are costly and/or limited. In the models we have discussed up to now, the 
problem has always been well defined, in the sense that an absolute maximum is 
chosen from among the set of feasible alternatives, no matter how complex a 
process that maximization may be. The alternatives themselves involve bounded 
rationality, but the process of choosing them does not. 

Here, too, an evolutionary approach may eventually turn out to be the key 
to a general solution.



Abstract. This chapter studies the implications of bounding the complexity
of players’ strategies in long term interactions. The complexity of a strategy 
is measured by the size of the minimal automaton that can implement it. 

A finite automaton has a finite number of states and an initial state. It 
prescribes the action to be taken as a function of the current state and its next 
state is a function of its current state and the actions of the other players. 
The size of an automaton is its number of states. 

The results study the equilibrium payoffs per stage of the repeated games
when players' strategies are restricted to those implementable by automata of
bounded size.



1 Introduction 

The simplest strategic game quickly gives rise to a game of formidable com- 
plexity when one considers a finitely-repeated version of it. This is because the 
number of pure strategies in the repeated game grows as a double exponential 
of the number of repetitions. To just write down in decimal form the number 
of pure strategies available to a player in the hundred-times repeated pris- 
oner’s dilemma would require more digits than the number of all the letters 
in all of the books in the world. This chapter examines the implications of 
restricting the set of strategies to those that are implementable by finite au- 
tomata of bounded size. Such restrictions place a bound on the complexity of 
strategies and they can (dramatically) alter the equilibrium play of a repeated 
game. 
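
The double exponential growth can be checked directly. The following is a minimal sketch, assuming (as in the definition used later in this chapter) that a pure strategy assigns an action to every history of action profiles; the function names are illustrative and do not come from the text.

```python
from math import log10


def history_count(num_profiles: int, T: int) -> int:
    """Number of histories before stages 1..T: 1 + k + ... + k**(T-1)."""
    return sum(num_profiles ** t for t in range(T))


def num_pure_strategies(num_actions: int, num_profiles: int, T: int) -> int:
    """A pure strategy assigns one of num_actions actions to every history."""
    return num_actions ** history_count(num_profiles, T)


def approx_decimal_digits(num_actions: int, num_profiles: int, T: int) -> float:
    """Approximate number of decimal digits of num_pure_strategies(...)."""
    return history_count(num_profiles, T) * log10(num_actions)


# Prisoner's dilemma: 2 actions per player, 4 action profiles per stage.
print(num_pure_strategies(2, 4, 2))               # 2**(1 + 4) = 32
print(f"{approx_decimal_digits(2, 4, 100):.3e}")  # digits needed for T = 100
```

For T = 100 the count itself is never materialized; only the number of its decimal digits is estimated, and that number alone exceeds 10^59.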

When we try to argue that an outcome is or is not an equilibrium in a game 
there are direct references to all possible strategies in that game. In the case 
of the hundred-times repeated Prisoner’s Dilemma it would obviously be an 
impossible task to merely write out all of the strategies, let alone construct the 
huge matrix which would constitute the explicit representation of this game 
in normal (strategic) form. Moreover, many of the strategies in the finitely 
or infinitely repeated game are extremely complicated. They may involve 
actions contingent on so many possible past events that it would be nearly 
impossible to describe them; even writing them down or writing a program to
execute some of these strategies is practically impossible. We consider here
a theory that limits the strategies available to a player in a repeated game.
The restriction is to those strategies that are implementable by bounded size
automata, the simplest theoretical model of a computer. It turns out that the
equilibria of the resulting strategic game can be dramatically different from 
those of the original game. 

In the finitely repeated Prisoner's Dilemma, it is well known that all
equilibria, and all correlated equilibria or communication equilibria, result
in the repeated play of (defect, defect). This is in striking contrast to the
experimental observation that real players do not always choose the dominant
action of defecting, but in fact achieve some mode of cooperation.

The present approach justifies cooperation in the finitely repeated pris- 
oner’s dilemma, as well as in other finitely repeated games, without departing 
from the hypothesis of strict utility maximization, but under the added as- 
sumption that there are bounds (possibly very large) on the complexity of the 
strategies that players may use. 

There are other methods of restricting strategies. I am not going to advo- 
cate here that the avenue we are taking is superior. Each one of the possible 
approaches has its pros and cons. 

The paper surveys results about the equilibrium payoffs of repeated games
when players' strategies in the repeated game are restricted. It also contains
several new results, e.g., Propositions 2, 3, 4, 5, 6, and 7. There is no attempt
here to survey all results related to the title, and therefore several important
and related papers are not covered in this survey.

2 The Model 

2.1 Strategic Games 

Let $G$ be an $n$-person game, $G = (N, A, r)$, where $N = \{1, 2, \ldots, n\}$ is the
set of players, $A = \times_{i \in N} A_i$, $A_i$ is a finite set of actions for player $i$,
$i = 1, \ldots, n$, and $r = (r^i)_{i \in N}$ where $r^i : A \to \mathbb{R}$ is the payoff function of
player $i$. The set $A_i$ is also called the set of pure strategies of player $i$. We
denote by $r : A \to \mathbb{R}^N$ the vector valued function whose $i$th component is $r^i$,
i.e., $r(a) = (r^1(a), \ldots, r^n(a))$. We also use the more detailed description of $G$,
$G = (N; (A_i)_{i \in N}; (r^i)_{i \in N})$, or $G = ((A_i)_{i \in N}; (r^i)_{i \in N})$ for short. For any
finite set (or measurable space) $B$ we denote by $\Delta(B)$ the set of all probability
distributions on $B$. For any player $i$ and any $n$-person game $G$, we denote
by $v^i(G)$ his individual rational payoff in the mixed extension of the game $G$,
i.e., $v^i(G) = \min \max r^i(a^i, \sigma^{-i})$ where the max ranges over all pure strategies
of player $i$, and the min ranges over all $N \setminus \{i\}$-tuples of mixed strategies of
the other players, and $r^i$ denotes also the payoff to player $i$ in the mixed
extension of the game. We denote by $u^i(G)$ the individual rational payoff of
player $i$ in pure strategies, i.e., $u^i(G) = \min \max r^i(a^i, a^{-i})$ where the max
ranges over all pure strategies of player $i$, and the min ranges over all
$N \setminus \{i\}$-tuples of pure strategies of the other players. Obviously
$u^i(G) \ge v^i(G)$. We denote by $w^i(G)$ the max min of player $i$ where he maximizes
over his mixed strategies and the min is over the pure strategies of the other
players, i.e., $w^i(G) = \max_{x \in \Delta(A_i)} \min_{a^{-i} \in A_{-i}} r^i(x, a^{-i})$ where
$A_{-i} = \times_{j \ne i} A_j$. Recall that the minimax theorem asserts that for a two
person game $G$, $v^i(G) = w^i(G)$. For any game $G$ in strategic form we denote by
$E(G)$ the set of all equilibrium payoffs in the game $G$, and by $F(G)$ the convex
hull of all payoff vectors in the one shot game, i.e., $F(G) = \mathrm{co}(r(A))$. Given a
2-person 0-sum game $G$ we denote by $\mathrm{Val}(G)$ the minimax value of $G$, i.e.,
$\mathrm{Val}(G) = v^1(G)$.
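
As a small illustration of the individual rational payoff in pure strategies defined above, the following sketch computes it by enumeration for two 2-person games: matching pennies, and the donor version of the Prisoners' Dilemma described in the Introduction of this volume. The function names and the action labels "take" and "give" are assumptions made for this example.

```python
from itertools import product


def pure_minmax(actions, payoff, i):
    """min over the others' pure profiles of max over player i's pure actions."""
    others = [j for j in range(len(actions)) if j != i]
    values = []
    for rest in product(*(actions[j] for j in others)):
        def profile(own, rest=rest):
            full = list(rest)
            full.insert(i, own)      # put player i's action back in position i
            return tuple(full)
        values.append(max(payoff(profile(own))[i] for own in actions[i]))
    return min(values)


def donor_game(profile):
    """Payoffs in $M: 'take' gives 1 to oneself, 'give' gives 4 to the other."""
    a1, a2 = profile
    r1 = (1 if a1 == "take" else 0) + (4 if a2 == "give" else 0)
    r2 = (1 if a2 == "take" else 0) + (4 if a1 == "give" else 0)
    return (r1, r2)


def matching_pennies(profile):
    a1, a2 = profile
    h = 1 if a1 == a2 else -1
    return (h, -h)


print(pure_minmax([["take", "give"]] * 2, donor_game, 0))
print(pure_minmax([["H", "T"]] * 2, matching_pennies, 0))
```

In matching pennies the pure-strategy value computed this way is 1 for each player, strictly above the mixed value 0, which is why the distinction between the two individual rational levels matters in what follows.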

2.2 The repeated games $G^T$ and $G^*$

Given an $n$-person game, $G = ((A_i)_{i \in N}; (r^i)_{i \in N})$, we define a new game in
strategic form $G^T = ((\Sigma^i(T))_{i \in N}; (r^i_T)_{i \in N})$ which models a sequence of $T$
plays of $G$, called stages. After each stage, each player is informed of what
the others did at the previous stage, and he remembers what he himself did
and what he knew at previous stages. Thus, the information available to each
player before choosing his action at stage $t$ is all past actions of the players in
previous stages of the game. Formally, let $H_t$, $t = 1, \ldots, T$, be the cartesian
product of $A$ by itself $t - 1$ times, i.e., $H_t = A^{t-1}$, with the common set
theoretic identification $A^0 = \{\emptyset\}$, and let $H = \cup_{t=1}^{T} H_t$. A pure strategy $\sigma^i$
of player $i$ in $G^T$ is a function $\sigma^i : H \to A_i$. Obviously, $H$ is a disjoint
union of $H_t$, $t = 1, \ldots, T$, and therefore one often defines $\sigma^i_t : H_t \to A_i$ as the
restriction of $\sigma^i$ to $H_t$. We denote the set of all pure strategies of player $i$ in
$G^T$ by $\Sigma^i(T)$. The set of pure strategies of player $i$ in the infinitely repeated
game $G^*$ is denoted by $\Sigma^i$, i.e., $\Sigma^i = \{\sigma^i : \cup_{t=0}^{\infty} A^t \to A_i\}$.

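Any $N$-tuple $\sigma = (\sigma^1, \ldots, \sigma^n) \in \times_{i \in N} \Sigma^i(T)$ ($\Sigma^i$) of pure strategies in $G^T$
(in $G^*$) induces a play $\omega(\sigma) = (\omega_1(\sigma), \ldots, \omega_T(\sigma))$ ($\omega(\sigma) = (\omega_1(\sigma), \omega_2(\sigma), \ldots)$)
defined by induction: $\omega_1(\sigma) = (\sigma^1(\emptyset), \ldots, \sigma^n(\emptyset)) = \sigma(\emptyset)$ and
$\omega_t(\sigma) = \sigma(\omega_1(\sigma), \ldots, \omega_{t-1}(\sigma))$, or in other words $\omega^i_1(\sigma) = \sigma^i(\emptyset)$ and
$\omega^i_t(\sigma) = \sigma^i(\omega_1(\sigma), \ldots, \omega_{t-1}(\sigma)) = \sigma^i_t(\omega_1(\sigma), \ldots, \omega_{t-1}(\sigma))$.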

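Set

$$r_T(\sigma) = \frac{r(\omega_1(\sigma)) + \ldots + r(\omega_T(\sigma))}{T}.$$

We define $R_T$, or $R$ for short, to be the function from plays of $G^T$ to the
associated payoffs, i.e., $R_T : A^T \to \mathbb{R}^N$ is given by

$$R_T(a_1, \ldots, a_T) = \frac{r(a_1) + \ldots + r(a_T)}{T}.$$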

Two pure strategies $\sigma^i$ and $\tau^i$ of player $i$ in $G^T$ (in $G^*$) are equivalent
if for every $N \setminus \{i\}$ tuple of pure strategies $\sigma^{-i} = (\sigma^j)_{j \in N \setminus \{i\}}$,
$\omega_t(\sigma^i, \sigma^{-i}) = \omega_t(\tau^i, \sigma^{-i})$ for every $1 \le t \le T$ ($1 \le t$). The equivalence
classes of pure strategies are called reduced strategies. For $i \in N$ let
$A_{-i} = \times_{j \ne i} A_j$. Then an equivalence class of pure strategies is naturally
identified with a function $\sigma^i : \cup_{t=0}^{\infty} (A_{-i})^t \to A_i$.

2.3 Finite Automata

We will consider strategies of the repeated games which are described by
means of automata (which are also sometimes referred to as Moore machines
or exact automata). An automaton for player $i$ consists of a finite state space
$M$; an initial state $q_1 \in M$; a function $f$ that describes the action to be
taken as a function of the different states of the machine, $f : M \to A_i$, where
$A_i$ denotes the set of actions of player $i$; and a transition function $g$ that
determines the next state of the machine as a function of its present state and
the action of the other players, i.e., $g : M \times A_{-i} \to M$. Thus, an automaton
of player $i$ is represented by a 4-tuple $(M, q_1, f, g)$. The size of an automaton
is the number of states.

This machine, the automaton, will change its state in the course of playing
a repeated game. At every state $q \in M$, $f$ determines what action it will take.
The next state of the automaton is determined by the current state and the
action taken by the other players. We can think of such an automaton as
playing a repeated game. It starts in its initial state $q_1$, and plays at the first
stage of the game the action assigned by the action function $f$, $f(q_1)$. Thus,
$f(q_1) = a^i_1$ is the action of the player at stage 1. The other players' action at
this stage is $b_1 = a^{-i}_1 \in A_{-i}$. Thus the history of play before the start of stage
2 is the $n$-tuple of actions, $(a^i_1, a^{-i}_1)$, played at the first stage of the game. As
a function of the present state, and the other players' actions, the machine is
transformed into a new state which is given by the transition function $g$. The
new state of the machine is $q_2 = g(q_1, b_1)$. The action that player $i$ takes at
stage 2, $a^i_2$, is described by the function $f$: $f(q_2) = f(g(q_1, b_1))$, and denoting
by $a^{-i}_2$ the action of the other players in stage 2, $(f(q_2), a^{-i}_2)$ is the pair of
actions played in the second stage of the repeated game, and so on.

What is the state of the machine at stage $t$ of the game? The machine
moves to a new state which is a function of the state of the machine
in the previous stage and the action played by the other players. Thus
$q_t = g(q_{t-1}, a^{-i}_{t-1})$ is the new state of the automaton at stage $t$, and player $i$
takes at stage $t$ the action $f(q_t) = f(g(q_{t-1}, a^{-i}_{t-1}))$, and so on.

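Define inductively,

$$g(q, b_1, \ldots, b_t) = g(g(q, b_1, \ldots, b_{t-1}), b_t),$$

where $b_j \in A_{-i}$. The action prescribed by the automaton for player $i$ at stage
$t$ is $f(g(q_1, a^{-i}_1, \ldots, a^{-i}_{t-1}))$ where $a^{-i}_j$, $1 \le j < t$, is the $N \setminus \{i\}$ tuple of actions
at stage $j$. Therefore, any automaton $\alpha = (M, q_1, f, g)$ of player $i$ induces a
strategy $\sigma^i_\alpha$ in $G^T$ that is given by $\sigma^i_\alpha(\emptyset) = f(q_1)$ and

$$\sigma^i_\alpha(a_1, \ldots, a_{t-1}) = f(g(q_1, a^{-i}_1, \ldots, a^{-i}_{t-1})).$$

Note also that an automaton $\alpha$ of player $i$ induces also a strategy $\sigma^i_\alpha$ of player
$i$ in the infinitely repeated game $G^*$. A strategy $\sigma^i$ of player $i$ in $G^*$ (in $G^T$)
is implemented by the automaton $\alpha$ of player $i$ if $\sigma^i$ is equivalent to $\sigma^i_\alpha$, i.e.,
if for every $\sigma^{-i} \in \Sigma^{-i}$ ($\Sigma^{-i}(T)$), $\omega(\sigma^i, \sigma^{-i}) = \omega(\sigma^i_\alpha, \sigma^{-i})$.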
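
The definition of an automaton and of the strategy it induces can be sketched in a few lines. This is only an illustration under stated assumptions: the class name, its field names, and the tit-for-tat example are not taken from the text, and the state space $M$ is represented implicitly by the keys of the action function.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable


@dataclass
class Automaton:
    initial: State                                  # q_1
    action: Dict[State, Action]                     # f : M -> A_i
    transition: Dict[Tuple[State, Action], State]   # g : M x A_{-i} -> M

    @property
    def size(self) -> int:
        return len(self.action)                     # number of states |M|

    def state_after(self, others: Sequence[Action]) -> State:
        """g(q_1, b_1, ..., b_t), defined inductively as in the text."""
        q = self.initial
        for b in others:
            q = self.transition[(q, b)]
        return q

    def induced_action(self, others: Sequence[Action]) -> Action:
        """Action prescribed after the other players' past actions."""
        return self.action[self.state_after(others)]


# Hypothetical example: tit-for-tat in a 2-action game with actions "C", "D".
tit_for_tat = Automaton(
    initial="C",
    action={"C": "C", "D": "D"},
    transition={(q, b): b for q in "CD" for b in "CD"},
)
print(tit_for_tat.size, tit_for_tat.induced_action(["C", "D"]))
```

Because the transition function depends only on the other players' actions, the induced strategy is determined by their past actions alone, which is what allows it to be identified with a reduced strategy.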

A finite sequence of actions $a_1, \ldots, a_t$ is compatible with the pure strategy
$\sigma^i$ of player $i$ in $G^*$, if for every $1 \le s \le t$, $\sigma^i(a_1, \ldots, a_{s-1}) = a^i_s$. Given
a strategy $\sigma^i$ of player $i$ in $G^*$, any sequence of actions $a_1, \ldots, a_t$ induces a
strategy $(\sigma^i \mid a_1, \ldots, a_t)$ in $G^*$, by

$$(\sigma^i \mid a_1, \ldots, a_t)(b_1, \ldots, b_s) = \sigma^i(a_1, \ldots, a_t, b_1, \ldots, b_s).$$

Proposition 1 The number of different reduced strategies that are induced
by a given pure strategy $\sigma^i$ of player $i$ in $G^*$ and all $\sigma^i$-compatible sequences
of actions equals the size of the smallest automaton that implements $\sigma^i$.
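
Proposition 1 suggests a way of computing the complexity of a strategy that is already given by some automaton: discard unreachable states and merge states that induce the same continuation strategy. The sketch below uses the standard partition refinement for Moore machines; this is a textbook technique swapped in for illustration, not a construction from the text, and all names are hypothetical.

```python
def minimal_size(initial, action, transition, inputs):
    """Size of the smallest automaton implementing the same strategy."""
    # 1. Keep only the states reachable from the initial state.
    reachable, stack = {initial}, [initial]
    while stack:
        q = stack.pop()
        for b in inputs:
            nxt = transition[(q, b)]
            if nxt not in reachable:
                reachable.add(nxt)
                stack.append(nxt)
    # 2. Start from the partition by prescribed action and refine it until
    #    states in the same block also have successors in the same blocks.
    block = {q: action[q] for q in reachable}
    while True:
        ids, refined = {}, {}
        for q in reachable:
            sig = (block[q],) + tuple(block[transition[(q, b)]] for b in inputs)
            refined[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)          # no block was split: partition is stable
        block = refined


# Tit-for-tat written wastefully with two "C" states and an unreachable "X".
action = {"C1": "C", "C2": "C", "D": "D", "X": "D"}
transition = {
    ("C1", "C"): "C2", ("C1", "D"): "D",
    ("C2", "C"): "C1", ("C2", "D"): "D",
    ("D", "C"): "C1", ("D", "D"): "D",
    ("X", "C"): "X", ("X", "D"): "C1",
}
print(minimal_size("C1", action, transition, ("C", "D")))
```

Each block of the final partition corresponds to one distinct continuation strategy, so the returned number is the count that appears in Proposition 1.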

2.4 Repeated Games with Finite Automata 

Given a strategic game $G$ and positive integers $m_1, \ldots, m_n$, we define $\Sigma^i(T, m_i)$
($\Sigma^i(m_i)$) to be all pure strategies in $\Sigma^i(T)$ (in $\Sigma^i$) that are induced by an
automaton of size $m_i$. Note that if a strategy is induced by an automaton of
size $m_i$ and $m_i' \ge m_i$ then it is also induced by an automaton of size $m_i'$. The
game $G^T(m_1, \ldots, m_n)$ is the strategic game $(N; (\Sigma^i(T, m_i))_{i \in N}; r_T)$ where
here $r_T$ is the restriction of our earlier payoff function $r_T$ to $\times_{i \in N} \Sigma^i(T, m_i)$.

The play in the supergame $G^*$ which is induced by an $n$-tuple of strategies
$\sigma = (\sigma^i)_{i \in N}$ with $\sigma^i \in \Sigma^i(m_i)$ enters a cycle of length $d \le \prod_{i \in N} m_i$ after a
finite number of stages. Indeed, if at stages $t$ and $s$ the $n$-tuple of states of the
automata coincide, then for every nonnegative integer $r$, $\omega_{t+r}(\sigma) = \omega_{s+r}(\sigma)$.
As the number of different $n$-tuples of automata states is bounded by $\prod_{i \in N} m_i$
the periodicity follows. Therefore, the limiting average payoff per stage is well
defined whenever all players are restricted to strategies which are implemented
by finite automata. The game $G^\infty(m_1, \ldots, m_n)$, or $G(m_1, \ldots, m_n)$ for short,
is the strategic game $(N; (\Sigma^i(m_i))_{i \in N}; r_\infty)$ where $r_\infty$ is defined as the limit
of our earlier payoff function $r_T$ as $T \to \infty$.
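
The periodicity argument can be turned into a direct computation of the limiting average payoff for two automata: play until a pair of states repeats, then average over the cycle. The sketch below is an illustration only; the tuple representation of an automaton, the function names, and the example automata are assumptions, and the stage payoffs are those of the donor version of the Prisoners' Dilemma from the Introduction of this volume ("C" for giving, "D" for taking).

```python
def limiting_average(auto1, auto2, payoff):
    """Each automaton is (initial state, action dict, transition dict keyed by
    (own state, other player's action)). Returns (cycle length, average payoffs)."""
    (q1, f1, g1), (q2, f2, g2) = auto1, auto2
    seen = {}              # pair of states -> index of the stage where it occurred
    stage_payoffs = []
    while (q1, q2) not in seen:
        seen[(q1, q2)] = len(stage_payoffs)
        a1, a2 = f1[q1], f2[q2]
        stage_payoffs.append(payoff(a1, a2))
        q1, q2 = g1[(q1, a2)], g2[(q2, a1)]
    cycle = stage_payoffs[seen[(q1, q2)]:]      # play repeats from here on
    n = len(cycle)
    return n, tuple(sum(p[i] for p in cycle) / n for i in range(2))


def donor_payoff(a1, a2):
    r1 = (1 if a1 == "D" else 0) + (4 if a2 == "C" else 0)
    r2 = (1 if a2 == "D" else 0) + (4 if a1 == "C" else 0)
    return (r1, r2)


tit_for_tat = ("C", {"C": "C", "D": "D"},
               {(q, b): b for q in "CD" for b in "CD"})
always_defect = ("q", {"q": "D"}, {("q", "C"): "q", ("q", "D"): "q"})
alternate = ("a0", {"a0": "C", "a1": "D"},
             {("a0", "C"): "a1", ("a0", "D"): "a1",
              ("a1", "C"): "a0", ("a1", "D"): "a0"})

print(limiting_average(tit_for_tat, always_defect, donor_payoff))
print(limiting_average(tit_for_tat, alternate, donor_payoff))
```

The loop stops after at most $m_1 m_2 + 1$ iterations, since that many state pairs cannot all be distinct, which is exactly the bound on the cycle length given above.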

3 Zero-Sum Games with Finite Automata 

In this section we present results on the value of 2-person 0-sum repeated games
with finite automata. Results concerning zero-sum games are important for
the study of the non-zero sum case, since they specify the individual rational
payoffs and thus the effective "punishments."

Consider the two-person zero-sum game of matching pennies: 



         1    -1
        -1     1



Assume that player 1, the row player, and player 2, the column player,
are restricted to play strategies that are implemented by automata of size $m_1$
and $m_2$ respectively. Recall that we are considering the mixed extension of
the game in which the pure strategies of player $i$ are those implemented by an
automaton of size $m_i$. An easy observation is that for every $m_1$ there exists a
sufficiently large $m_2$, a pure strategy $\tau \in \Sigma^2(m_2)$ and a positive integer $T$ such
that for any $t \ge T$ and $\sigma \in \Sigma^1(m_1)$, $r^1(\omega_t(\sigma, \tau)) = -1$. Therefore, we conclude
in particular that for the above matching pennies game $G = (A, B, h)$, for
every $m_1$ there exists $m_2$ such that $\mathrm{Val}(G(m_1, m_2)) = \max_{a \in A} \min_{b \in B} h(a, b)$.
Moreover this statement is valid for any two-person zero-sum game
$H = (A, B, h)$. Theorem 1 of Ben-Porath (1993) asserts that if $m_2 \ge m_1 |\Sigma^1(m_1)|$,
where for a set $X$, $|X|$ denotes the number of elements in $X$, then

$$\mathrm{Val}(H(m_1, m_2)) = \max_{a \in A} \min_{b \in B} h(a, b).$$

Note that $|\Sigma^1(m)|$ is of the order of an exponential function of $m \log m$. However,
it turns out that if the larger bound $m_2$ is subexponential in $m_1$, player
2 is unable to use effectively in the long run his larger bound. Indeed,



Theorem 1 (Ben-Porath, 1986, 1993). Let $H = (A, B, h)$ be a two person
0-sum game in strategic form, and let $(m(n))_{n=1}^{\infty}$ be a sequence of positive
integers with

$$\lim_{n \to \infty} \frac{\log m(n)}{n} = 0.$$

Then,

$$\liminf_{n \to \infty} \mathrm{Val}(H(n, m(n))) \ge \mathrm{Val}(H).$$



Proof. W.l.o.g. we assume that $n \le m(n)$. For every sequence
$a = (a_1, a_2, \ldots)$ of actions of player 1 we denote by $\sigma^a$ the pure strategy of
player 1 with $\sigma^a_t(*) = a_t$. Note that if $a$ is $k$-periodic then $\sigma^a \in \Sigma^1(k)$. For every
$k$, $\sigma^1(k)$ denotes the mixed strategy $\sigma^X$ of player 1 where $X = (X_1, X_2, \ldots)$
is a random $k$-periodic sequence of actions of player 1, with $X_1, X_2, \ldots, X_k$
i.i.d. and the distribution of $X_t$ is an optimal strategy of player 1 in the one
shot game. It follows that for every pure strategy $\tau$ of player 2 and every
$t \le k$, $E_{\sigma^1(k),\tau}(h(a_t, b_t) \mid \mathcal{H}_t) \ge \mathrm{Val}(H)$, where $\mathcal{H}_t$ denotes the algebra
generated by the actions $a_1, b_1, \ldots, a_{t-1}, b_{t-1}$ in stages $1, \ldots, t-1$. Therefore
$\mathrm{Prob}_{\sigma^1(k),\tau}(\sum_{t=1}^{k} h(a_t, b_t)/k < \mathrm{Val}(H) - \varepsilon) \le \exp(-C(\varepsilon)k)$ with $C(\varepsilon) > 0$.
Therefore for every finite set $T \subset \Sigma^2$,

$$\mathrm{Prob}_{\sigma^1(n)}(\min_{\tau \in T} h_n(\sigma^a, \tau) < \mathrm{Val}(H) - \varepsilon) \le |T| \exp(-C(\varepsilon)n). \quad (1)$$

Let $\tau$ be a pure strategy of player 2 which is implemented by an automaton of
size $m(n)$, and set $T = \{(\tau \mid b_1, \ldots, b_t)\}$. Then for every $n$-periodic sequence
$a$ and every positive integer $s$,

5+n 5+n 

> min '^))- 

As $|T| \le m(n)$, and $\sigma^1(n)$ is a mixture of (at most $|A|^n$) pure strategies of
the form $\sigma^a \in \Sigma^1(n)$, the result follows from (1). ■

It is worth mentioning that the proof implies a stronger result. Setting
$\bar{\Sigma}^i(m)$ to be all strategies $\sigma^i$ such that for each $t$,
$|\{(\sigma^i \mid h_1, \ldots, h_t) : h_j \in A\}| \le m$, and $\sigma^1(n)$ as constructed in the proof,
we conclude that under the same condition as in the theorem,

liminfFa/ > Val (H). 

This stronger result implies that whenever $\lim_{n \to \infty} \log m(n)/n = 0$, for every
$n$ there exists a random $n$-periodic sequence of actions of player 1, ($\sigma^X$), which
guarantees approximately the value $\mathrm{Val}(H)$ against any strategy in $\bar{\Sigma}^2(m(n))$.
Note that for every pure strategy $\sigma$ of player 1, there exists a strategy
$\tau \in \bar{\Sigma}^2(1)$ with $h_T(\sigma, \tau) \le \max_{a \in A} \min_{b \in B} h(a, b)$. The next result asserts that
when $m(n) \log m(n) = o(n)$ as $n \to \infty$, then there is a deterministic $n$-periodic
sequence of actions of player 1, $a$, such that $\sigma^a$ guarantees approximately
$\mathrm{Val}(H)$ when player 2 is restricted to strategies in $\Sigma^2(m(n))$.
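
The observation that any single pure strategy of player 1 can be held to the pure maxmin is easy to simulate when that strategy is given by an automaton: player 2 replies at each stage with an action minimizing player 1's stage payoff against the action that the current state prescribes. This is a sketch under the assumption that player 2 knows player 1's automaton; the function name and the three-state example are hypothetical.

```python
def hold_to_pure_maxmin(auto1, h, A, B, T):
    """auto1 = (initial state, action dict, transition dict keyed by
    (state, player 2's action)). Returns (average payoff of player 1 over
    T stages, pure maxmin of player 1)."""
    q, f, g = auto1
    total = 0
    for _ in range(T):
        a = f[q]
        b = min(B, key=lambda y: h[(a, y)])   # stage reply minimizing h(a, .)
        total += h[(a, b)]
        q = g[(q, b)]
    pure_maxmin = max(min(h[(a, b)] for b in B) for a in A)
    return total / T, pure_maxmin


# Matching pennies, with player 1 cycling through H, T, T regardless of input.
A = B = ("H", "T")
h = {(a, b): (1 if a == b else -1) for a in A for b in B}
cycler = (0, {0: "H", 1: "T", 2: "T"},
          {(s, b): (s + 1) % 3 for s in range(3) for b in B})
print(hold_to_pure_maxmin(cycler, h, A, B, 30))
```

The reply sequence depends on the particular strategy of player 1; the results of this section concern what player 2 can achieve with one automaton of bounded size against all strategies of player 1 at once.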

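Proposition 2 Let $m : \mathbb{N} \to \mathbb{N}$ with $\lim_{n \to \infty} m(n) \log m(n)/n = 0$. Then for
every $n$ there exists an $n$-periodic sequence of actions of player 1, $a$, such that

$$\lim_{n \to \infty} (\inf\{h_t(\sigma^a, \tau) \mid \tau \in \Sigma^2(m(n)), t \ge n\}) = \mathrm{Val}(H).$$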

Proof. Note that there is a positive constant $K$ such that
$|\Sigma^2(m(n))| \le \exp(K m(n) \log m(n))$. Let $k : \mathbb{N} \to \mathbb{N}$ be such that
$\lim_{n \to \infty} m(n) \log m(n)/k(n) = 0$ and $\lim_{n \to \infty} k(n)/n = 0$. Let
$X = (X_1, \ldots, X_{k(n)}, \ldots)$ be a random $n$-periodic sequence of actions of
player 1, where $X_1, \ldots, X_{k(n)}$ are i.i.d., each distributed according to the
distribution of an optimal mixed strategy of player 1 in the one shot game, and
$(X_1, \ldots, X_n)$ is $k(n)$-periodic. As $\lim_{n \to \infty} m(n) \log m(n)/k(n) = 0$,
it follows that for every positive constant $C > 0$,

$$\lim_{n \to \infty} |\Sigma^2(m(n))| \exp(-Ck(n)) = 0,$$

and therefore it follows from (1) that

lim Pr( min ^ < Val(iJ) — £:) = 0 

n-i-oo rGS2(m(n)) 

and therefore there is an $n$-periodic sequence of actions $a$ such that

$$\lim_{n \to \infty} (\inf\{h_t(\sigma^a, \tau) \mid \tau \in \Sigma^2(m(n)), t \ge n\}) \ge \mathrm{Val}(H).$$



The next result follows from the proof of the result of Ben-Porath (1993), 
and is used in the proof of theorems 5 and 6. 

Theorem 2 For every e > 0 sufficiently small, if 

exp{e‘^mi) > m 2 > 1, 

then for every positive integer $T$,

$$\mathrm{Val}(H^T(m_1, m_2)) \ge \mathrm{Val}(H) - \varepsilon.$$







The next corollary is a restatement of Theorems 1 and 2 which provides 
a lower bound for equilibrium payoffs in nonzero sum repeated games with 
finite automata. 

Corollary 1 For every strategic game $G = (N, A, r)$, $i \in N$, and $\varepsilon > 0$
sufficiently small, if

exp{e^mi) >rnj>l for every j / i, 

then for every $x \in E(G^T(m_1, \ldots, m_n))$ or $x \in E(G(m_1, \ldots, m_n))$,

$$x^i \ge w^i(G) - \varepsilon.$$

The next result asserts that if the bound on the sizes of the automata of 
player 2 is larger than an exponential of the sizes of the automata of player 
1, then player 2 could hold player 1 down to his maxmin in pure strategies. 

Theorem 3 For every 2-person 0-sum game $H = (\{1, 2\}; (A, B); h)$, and every
positive constant $K$ with $K > \ln |A|$, if $m(n) \ge \exp(Kn)$, then

$$\mathrm{Val}(H(n, m(n))) \to \max_{a \in A} \min_{b \in B} h(a, b) \quad \text{as } n \to \infty.$$

Proof. Let $K > \ln |A|$. It is sufficient to prove that for every $\varepsilon > 0$ there
exists $n_0$ such that for every $n \ge n_0$ and every $m \ge \exp(Kn)$,

$$\mathrm{Val}(H(n, m)) \le \max_{a \in A} \min_{b \in B} h(a, b) + \varepsilon.$$

Note that for every positive constant $C$ there exists $n_0$ such that for every
$n \ge n_0$, $\exp(Kn) \ge C n^2 |A|^n$. Therefore the theorem follows from the next
Lemma. ■

Lemma 1 For every s > 0, there is a sufficiently large integer K K{e) , 
such that for every m > K'n?\A\^ there exists a strategy r* E A(E^(m)) such 
that for every T > K‘^n^\A\^, and any strategy a E L)^(n), 

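$$h_T(\sigma, \tau^*) \le \max_{a \in A} \min_{b \in B} h(a, b) + \varepsilon,$$

and therefore,

$$\mathrm{Val}(G^T(n, m)) \le \max_{a \in A} \min_{b \in B} h(a, b) + \varepsilon,$$

and

$$\mathrm{Val}(G(n, m)) \le \max_{a \in A} \min_{b \in B} h(a, b) + \varepsilon.$$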



Proof. The idea of the proof is as follows: for every pure strategy
$\sigma \in \Sigma^1(n)$ of player 1, there is $1 \le k \le n$ and a sequence of actions
$b = b_1, \ldots, b_n, b_{n-k+1}, \ldots$ with $b_i = b_j$ whenever $i > j > n - k$ and
$i - j = 0 \pmod{k}$, such that the strategy of player 2 which plays the sequence $b$
results in payoff $\le \max_{a \in A} \min_{b \in B} h(a, b)$ in each stage $t > m - k$. Such
a strategy is implemented by an automaton of size $n$, and the number of
such strategies is bounded by $n|B|^n$. The strategy of player 2 imitates
choosing at random a pair and, if the resulting payoffs are not sufficiently
small, it attempts another randomly chosen pair. The sufficiently
large number of states of the automaton of player 2 guarantees that with
high probability the induced play will eventually enter a cycle with payoffs
$\le \max_{a \in A} \min_{b \in B} h(a, b)$ in each stage.

Formally, let 6 : A — ^ jB be a selection from the best reply correspondence 
of player 2. Construct the following mixed strategy of player 2, r*, which is 
implemented by an automaton with state space 

where i — Kn\A\‘^. The initial state of the automaton of player 2 is (1, 1). Let 
a : > A be a random function, each such function equally likely, i.e., for 

every I < i < n , and every 1 < i < ^, a{i,j) is a random element of A each 
one equally likely, and the various random elements a(z,j) are independent. 
We define now the random action function of the automaton. 






The transition function of the automaton depends on a random sequence 
k = kiy . . . ,kij I < kj < Tij each such sequence equally likely and the sequence 
k is independent of the function a. We are ready now to define the transition 
function which depends on the functions b and a and on the random sequence 



k. 






< 



(* + i,i) 

(i.i + i) 

( 1 . 1 ) 



if i < n and c = a(z, j) 
if i = n and c = a(z, j) 
if j < i and c / 
otherwise. 



Let cr be a pure strategy of player 1 that is implemented by an automaton of 
size n. Let a?i, ^ 2 , . . . where Xt = (at, 6t) be the random play induced by the 
strategy pair a and r*, and let random sequence of states of 

the automaton of player 2. Fix 1 < j < ^ and let t = tj be the random time 
of the first stage t with = (1, j)* Note that 



Prob{at^s = a(s -f 1, j) V 0 < s < n) = 



1 

|A|«- 



and if at +5 = 0 ( 5 + 1, j) V 0 < s < n then there exists 0 < 5 < n such that the 
state of the automaton of player 1 at stage t An, coincides with its state 
at stage t As. Therefore if kj = s -h 1, the play will enter a cycle in which the 
payoff to player 1 is at most max^^^ miribqB k{a, b). Therefore the conditional 
probability, given the history of play up to stage tj , that the payoff to player 
1 in any future stage is at most maXa^A minb^B h{a, 6), and that tj^i = 00 is 
at least l/(\A\^n). Otherwise, iftj+i < 00 , < tj A Therefore, either 

$t_\ell = \infty$ and then for every stage $t > \ell n^2$, the payoff to player 1 is at most
$\max_{a \in A} \min_{b \in B} h(a, b)$, or $t_\ell < \infty$. However, the previous inequalities imply
that,

$$\mathrm{Prob}(t_\ell = \infty) \ge 1 - (1 - 1/(n|A|^n))^{\ell - 1} \to 1 \quad \text{as } K \to \infty,$$

which completes the proof of the lemma. ■

It is of interest to bridge between the results of this section concerning the
infinitely repeated 2-person 0-sum games, by providing asymptotic results on
the value $\mathrm{Val}(H(m_1, m_2))$ when $m_1 \to \infty$ and $m_2$ is approximately a fixed
exponential function of $m_1$. Given a 2-person 0-sum game $H = (A, B, h)$,
it will be interesting to find the largest (smallest) monotonic nondecreasing
functions $\underline{v} : (0, \infty) \to \mathbb{R}$ ($\bar{v} : (0, \infty) \to \mathbb{R}$) such that if
$(\ln m_2)/m \to \alpha > 0$ as $m \to \infty$ then

$$\underline{v}(\alpha) \le \liminf_{m \to \infty} \mathrm{Val}(H(m, m_2)) \le \limsup_{m \to \infty} \mathrm{Val}(H(m, m_2)) \le \bar{v}(\alpha).$$

Theorem 1 asserts that $\lim_{\alpha \to 0} \underline{v}(\alpha) = \lim_{\alpha \to 0} \bar{v}(\alpha) = \mathrm{Val}(H)$, and
Theorem 3 asserts that for $\alpha > \ln |A|$, $\bar{v}(\alpha) = \max_{a \in A} \min_{b \in B} h(a, b)$. We
conjecture that the two functions $\underline{v}$ and $\bar{v}$ are continuous with $\underline{v} = \bar{v}$ for all
values of $\alpha > 0$ with the possible exception of one critical value.

The next two conjectures address the number of repetitions needed for an
unrestricted player to use his advantage over bounded automata. A positive
resolution of either conjecture has implications for the equilibrium
payoffs of finitely repeated games with automata. A positive resolution of the
next conjecture will provide a positive answer to conjecture 3.

Conjecture 1 For every $\varepsilon > 0$, if $m : \mathbb{N} \to \mathbb{N}$ satisfies $m(T) \ge \varepsilon T$, then

$$\lim_{T \to \infty} \mathrm{Val}(H^T(m(T), \infty)) = \mathrm{Val}(H).$$

The truth of the above conjecture implies that there is a function
$m : \mathbb{N} \to \mathbb{N}$ with $\lim_{T \to \infty} m(T)/T = 0$, and such that

$$\lim_{T \to \infty} \mathrm{Val}(H^T(m(T), \infty)) = \mathrm{Val}(H).$$

An interesting open problem is to find the "smallest" such function. The next
conjecture specifies a domain for such a function.

Conjecture 2 If $m : \mathbb{N} \to \mathbb{N}$ obeys $\lim_{T \to \infty} (T/\log T)/m(T) = 0$, then

$$\lim_{T \to \infty} \mathrm{Val}(H^T(m(T), \infty)) = \mathrm{Val}(H).$$

If $m : \mathbb{N} \to \mathbb{N}$ obeys $\lim_{T \to \infty} m(T)/(T/\log T) = 0$, then

$$\lim_{T \to \infty} \mathrm{Val}(H^T(m(T), \infty)) = \max_{a^1 \in A_1} \min_{a^2 \in A_2} h(a^1, a^2).$$

4 Equilibrium Payoffs of the Supergame $G^\infty$



We state here a result, due to Ben-Porath, which is a straightforward corollary
of his result in the 2-person 0-sum case. All convergence of sets is with respect
to the Hausdorff topology. Recall that for a sequence of subsets, $E_n$, of a
Euclidean space $\mathbb{R}^k$,

$$\liminf_{n \to \infty} E_n = \{x \in \mathbb{R}^k \mid \forall \varepsilon > 0, \exists N \text{ s.t. } \forall n \ge N, d(x, E_n) < \varepsilon\}$$

where $d(x, E)$ denotes the distance of the point $x$ from the set $E$,

$$\limsup_{n \to \infty} E_n = \{x \in \mathbb{R}^k \mid \forall \varepsilon > 0, \forall N, \exists n \ge N \text{ with } d(x, E_n) < \varepsilon\},$$

and $\lim_{n \to \infty} E_n = E$ if $E = \liminf E_n = \limsup E_n$.



Theorem 4 (Ben-Porath 1986, 1993). Let $G = (N; (A_i)_{i \in N}; (r^i)_{i \in N})$ be a
strategic game, and $m_i(k)$, $i \in N$, sequences with $\lim_{k \to \infty} m_i(k) = \infty$ and

$$\lim_{k \to \infty} \frac{\log(\max_{i \in N} m_i(k))}{\min_{i \in N} m_i(k)} = 0.$$

Then,

$$\{x \in F \mid x^i > v^i(G)\} \subset \liminf_{k \to \infty} E(G(m_1(k), \ldots, m_n(k))),$$

and

$$\limsup_{k \to \infty} E(G(m_1(k), \ldots, m_n(k))) \subset \{x \in F \mid x^i \ge w^i(G)\}.$$



Note that in two-person games $v^i(G) = w^i(G)$ and therefore the above
theorem provides exact asymptotics for two-person games. An interesting
open problem is to find the asymptotic behavior of $E(G(m_1(k), \ldots, m_n(k)))$ as
$k \to \infty$ and $\lim_{k \to \infty} \log(\max_{i \in N} m_i(k)) / \min_{i \in N} m_i(k) = 0$. Such questions
lead to the study of the asymptotics of

$$v^i(G(m_1(k), \ldots, m_n(k))) = \min_{\tau^{-i}} \max_{\sigma^i} r^i(\sigma^i, \tau^{-i}),$$

where the min ranges over all $\tau^{-i} \in \times_{j \ne i} \Delta(\Sigma^j(m_j(k)))$ and the max is over
$\sigma^i \in \Sigma^i(m_i(k))$ and where $m_i(k)$, $i \in N$, is a sequence with $\lim_{k \to \infty} m_i(k) = \infty$
and $\lim_{k \to \infty} \log(\max_{i \in N} m_i(k)) / \min_{i \in N} m_i(k) = 0$. W.l.o.g. assume
that $m_1(k) \le m_2(k) \le \ldots \le m_n(k)$ and $\lim_{k \to \infty} \log m_n(k)/m_1(k) = 0$, and
let $i < n$. Set $v^i(k) = v^i(G(m_1(k), \ldots, m_n(k)))$. We denote by $Q(i)$, or $Q$ for
short, the set of all probability measures on $A_{-i}$ whose marginal distribution
on $\times_{j < i} A_j$ is a product measure. The following is a partial answer to the
study of the asymptotics of $v^i(k)$.

Proposition 3 (a) //limjk_oo 



lim sup (A:) < min max ^ q{a 
jfc— oo qeQ a^eAi 

a 



a-i). 

(b) If for a fixed player $1 \le i < n$, $\lim_{k \to \infty}$

limsupv®(fc) < min max g'(a~*)r*(a*, a~*). 

Jk — oo a^eAi ^ 

a-*£A-.i 

Proof. Part (a) follows from Proposition 2. We turn to the proof of part (b). 
Let be a sequence of positive integers with limjfc_,oo N{k)/ log rrii(k) 

= oo and limfc_,oo ^(^)/log^i+i(^) = 0. The constructed N \ {z} tuple of 
minimax strategies, will enter a cycle of length N{k), following the 

first N{k)(n — 1) stages. The cycle play, Xi, . . . , Xjv(jfc), is a sequence of i.i.d. 
actions in with each Xt distributed according to a minimizing probability 
q £ Q. For every k let q*{k) G Q, or g for short, attain the minimum and 
let qj be the marginal distribution of q on Aj^ j < i. Let , j < z, be 
the strategy which plays a random A^(Ar)-periodic sequence of actions 
Xj , ^ 2 , . . . where X{, ^ are i.i.d. and the distribution of each Xj is qj . 

We define next the strategy for j > i, which is a mixture of pure strategies, 
each implemented by an automaton of size which for sufficiently 

large fc is < mi+i(fc). For every 6 = (6i, . . . , 6f_i, 6i+i, . . . , &„) G 
denote by 6^ the projection of b on Xi:^jt^jAj> , and let denote the 

marginal of the conditional probability (^|6^ ) on Aj. The automaton includes 
|A|“^(^) + (j — i — l)N{k) states which are used to record the realization of 
the choices of all players in stages N{k){j — z — 1) + 1, . . . , N{k){j — z) stages. 
Thereafter, player j plays an A/'(A:)-periodic sequence a^, . . . , which is 

a realization of a sequence of independent actions X{, . . . , with the 

distribution of X^ being One verifies that following the 

first (n — l)N(k) stages the play of players j ^ i enters an N{k) cycle of i.i.d 
actions each distributed according to q. ■ 

In the above proof we can also construct the minimaxing strategies of 
players j > i to be pure strategies, as in Proposition 2. Consider the following 
3 player game G. 



        0,0,8    0,0,8            0,0,0    8,0,4
        0,8,4    0,0,0            0,0,8    0,0,8


Player 1 chooses the row, player 2 the column, and player 3 chooses the
matrix. Note that $v^1(G) = 0 = v^2(G)$ and $v^3(G) = 5$. However, $w^3(G) = 4$
(and $w^1(G) = w^2(G) = 0$). Therefore we can not deduce from Theorem
4 whether or not the vector payoff (4,4,4) is approximated by equilibrium
payoffs of the restricted games $G(m_1(k), m_2(k), m_3(k))$ for sufficiently large
$k$ and where $k \le m_i(k)$ are sequences with $\log \max m_i(k) / \min m_i(k) \to 0$ as
$k \to \infty$. However, Proposition 3 characterizes for this game the limit of the
equilibrium payoffs provided that we assume in addition that
$\lim_{k \to \infty} \log \max(m_1(k), m_2(k)) / \log m_3(k) = \infty$. In particular, it follows in
this case that (4,4,4) is in the limit of the equilibrium payoffs.
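
The two individual rational levels of player 3 quoted for this game can be checked numerically. The sketch below assumes that the payoff entries of the two matrices are read row by row, with the first matrix on the left; under that reading a search over a grid of exact fractions reproduces the values 5 and 4. The names `R3`, `GRID` and the two functions are hypothetical, and a grid search is a check of the quoted values rather than a general algorithm.

```python
from fractions import Fraction
from itertools import product

# Player 3's payoff R3[matrix][row][column], under the row-by-row reading.
R3 = [[[8, 8], [4, 0]],
      [[0, 4], [8, 8]]]

GRID = [Fraction(k, 20) for k in range(21)]


def payoff3(matrix, p, q):
    """Player 3's expected payoff when players 1 and 2 mix independently:
    first row with probability p, first column with probability q."""
    row, col = (p, 1 - p), (q, 1 - q)
    return sum(row[i] * col[j] * R3[matrix][i][j]
               for i in range(2) for j in range(2))


def v3_on_grid():
    """min over independent mixtures of the others of max over matrices."""
    return min(max(payoff3(0, p, q), payoff3(1, p, q))
               for p, q in product(GRID, GRID))


def w3_on_grid():
    """max over mixtures x of player 3 of min over pure (row, column) pairs."""
    return max(min(x * R3[0][i][j] + (1 - x) * R3[1][i][j]
                   for i in range(2) for j in range(2))
               for x in GRID)


print(v3_on_grid(), w3_on_grid())
```

The gap between the two numbers comes from correlation: players 1 and 2 can hold player 3 to 4 only by coordinating on the two diagonal cells, which independent mixtures cannot do.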







We state now a result which provides a partial answer to the asymptotic 
behavior of the set of equilibrium payoffs of repeated games with bounded 
automata. Denote by 

$$d^i = \min_{q \in Q(i)} \max_{a^i \in A_i} \sum_{a^{-i} \in A_{-i}} q(a^{-i}) r^i(a^i, a^{-i})$$

and

$$F = \{x \in F(G) \mid x^i > d^i\}.$$

Proposition 4 Assume that mi{k) < ... < mn{k), limjb_oo 

limit_oo ° ^ ^ = 0- Then, 

$$\liminf_{k \to \infty} E(G(m_1(k), \ldots, m_n(k))) \supset F.$$

5 Cooperation in Finitely Repeated Games 

The results in this section address the asymptotic behavior of the sets of
equilibrium payoffs, $E(G^T(m_1, m_2))$, of the games $G^T(m_1, m_2)$, as $T$, $m_1$
and $m_2$ go to $\infty$. All convergence of sets is with respect to the Hausdorff
topology. In each one of the theorems in the present section we assume that
$G = (\{1, 2\}, A, r)$ is a fixed 2-person strategic game, $F = F(G)$ stands for the
feasible payoffs in the infinitely repeated game, i.e., $F = \mathrm{co}(r(A))$, and that
$(T(n), m_1(n), m_2(n))_{n=1}^{\infty}$ is a sequence of triples. For simplicity, the
statements of the theorems are nonsymmetric with respect to the two players, and
therefore we assume in addition that $m_2(n) \ge m_1(n)$. We also often suppress
the dependence on $n$; no confusion should result.

Theorem 5 Let $G = (\{1, 2\}, A, r)$ be a two person game in strategic form,
and assume that there is $x \in F(G)$ with $x^1 > v^1(G)$, and $x^2 > u^2(G)$. Then,
z/mi(n) 00 and -^0 as n 00 , 

$$\liminf_{n \to \infty} E(G^T(m_1, m_2)) \supset \{x \in F \mid x^1 \ge v^1(G) \text{ and } x^2 \ge u^2(G)\}.$$



Special cases of the above theorem have been stated in previous publications.
Neyman (1985) states that in the case of the finitely repeated prisoner's
dilemma $G$, for any positive integer $k$, there is $T_0$ such that if $T \ge T_0$ and
$T^{1/k} \le \min(m_1, m_2) \le \max(m_1, m_2) \le T^k$, then there is a mixed strategy
equilibrium of $G^T(m_1, m_2)$ in which the payoff is $1/k$-close to the "cooperative"
payoff of $G$. Papadimitriou and Yannakakis (1994) state the special
case of theorem 5 obtained by assuming that the payoffs of the underlying
game are rational numbers and replacing $F(G)$ in the statement of the theorem
with $\{x \in r(A) \text{ with } x^i > v^i(G)\}$. They also state a result for a subset
of $F$ with the additional assumption that the bounds on both automata are
subexponential in the number of repetitions.

The conclusion of the theorem fails if we replace in the assumptions of the
theorem the strict inequality $x^1 > v^1(G)$ by the weak inequality $x^1 \ge v^1(G)$.
For example in the game

        0,4    1,3
        1,1    1,0

the only equilibrium payoff in $G^T(m_1, m_2)$ with $m_2 \ge 2^T$ is (1,1).

The next theorem relates the equilibrium payoffs of $G^T(m_1, m_2)$ to the
equilibrium payoffs of the undiscounted infinitely repeated game $G^*_\infty$. Recall
that the Folk Theorem asserts that

$$E(G^*_\infty) = \{x \in F \mid x^1 \geq v^1(G) \text{ and } x^2 \geq v^2(G)\}.$$



Theorem 6 Let $G = (\{1,2\}, A, r)$ be a two-person game in strategic form,
and let $(T, m_1(T), m_2(T))_{T=1}^{\infty}$ be a sequence of triples of positive integers with
$m_1(T) \leq m_2(T)$ and $m_1(T) \to \infty$ as $T \to \infty$, and

$$\lim_{T\to\infty} \frac{\log m_2(T)}{\min(m_1(T), T)} = 0.$$

Then,

$$\lim_{T\to\infty} E(G^T(m_1(T), m_2(T))) = E(G^*_\infty).$$



The limiting assumption $\lim_{T\to\infty} \log m_2(T)/\min(m_1(T), T) = 0$ in theorem 6
could probably be replaced by an alternative lower bound, as a function of $T$, on $m_1(T)$,
provided that we also assume that there is $x \in F$ with $x^i > v^i(G)$. One
example of such a result is presented in the following conjecture.



Conjecture 3 Let $G = (\{1,2\}, A, r)$ be a two-person game in strategic form,
and assume that there is $x \in F$ with $x^i > v^i(G)$. If

$$\liminf_{T\to\infty} m_1(T)/T > 0$$

and

$$\lim_{T\to\infty} \frac{\log m_1(T)}{T} = 0,$$

then

$$\lim_{T\to\infty} E(G^T(m_1, m_2)) = \{x \in F \mid x^i \geq v^i(G)\}.$$

The next theorem is straightforward. We state it as a contrast to the
previous results. It shows that the subexponential bounds on the
sizes of the automata as a function of the number of repetitions are essential to
obtain equilibrium payoffs that differ from those of the finitely repeated game
$G^T$.

Theorem 7 For every game $G$ in strategic form there exists a constant $c$ such
that if $m_i \geq \exp(cT)$ then

$$E(G^T(m_1, \ldots, m_n)) = E(G^T).$$
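
One standard way to see why exponentially large automata lift the restriction: a pure strategy of the $T$-stage game only has to distinguish the histories of the other players' actions of length less than $T$, so an automaton with one state per such history can implement it. The sketch below counts those states. The function name and the choice to count opponent histories are illustrative assumptions, not the construction used in the source.

```python
import math


def tree_automaton_size(num_opponent_profiles: int, T: int) -> int:
    """Count the opponent-action histories of length 0, 1, ..., T-1.

    One state per history is enough to implement any pure strategy of the
    T-stage game, because the player's own past actions are determined by
    the strategy itself.
    """
    return sum(num_opponent_profiles ** t for t in range(T))


def fits_exponential_bound(num_opponent_profiles: int, T: int) -> bool:
    """Check the count against exp(c*T) with c = log(number of profiles)."""
    c = math.log(num_opponent_profiles)
    return tree_automaton_size(num_opponent_profiles, T) <= math.exp(c * T)
```

With two opponent actions and $T = 10$ the count is $2^{10} - 1$, which stays below $\exp(cT)$ for $c = \log 2$.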
6 Repeated Games with Bounded Recall

Aumann (1981) mentioned two ways of modeling a player with bounded
rationality: with finite automata and with bounded recall strategies. There are
two alternative ways to define strategies with bounded recall. The first one (see
e.g. Aumann and Sorin, 1989) considers strategies with bounded recall which
choose an action as a function of the recalled opponents' actions, and the
second alternative (see e.g. Kalai and Stanford, 1988, or Lehrer 1988) allows
a player to rely on his opponents' actions as well as on his own. The following
are results on repeated games with bounded recall of the second type which
are closely related to those presented for finite automata.

Let $BR^i(m)$ denote all strategies of player $i$ in a repeated game that choose
an action as a function of all players' actions in the last $m$ stages. Each pure
strategy $\sigma^i \in BR^i(m)$ is thus represented by a function $f^i : A^m \to A_i$ and a
fixed element, the initial memory, $e = (e_1, \ldots, e_m) \in A^m$; for $t > m$,
$\sigma^i(a_1, \ldots, a_{t-1}) = f^i(a_{t-m}, \ldots, a_{t-1})$, and for $t \leq m$,
$\sigma^i(a_1, \ldots, a_{t-1}) = f^i(e_t, \ldots, e_m, a_1, \ldots, a_{t-1})$. Given a strategy
$\sigma^i = (e, f^i) \in BR^i(m)$, the automaton $(A^m, e, f^i, g^i)$ where $g^i(x, y)$,
$x = (x_1, \ldots, x_m) \in A^m$ and $y \in A_{-i}$, equals $(x_2, \ldots, x_m, (f^i(x), y))$,
implements the strategy $\sigma^i$. Thus each strategy in $BR^i(m)$ is implemented by
an automaton of size $|A|^m$, or in symbols and identifying a strategy with its
equivalence class, $BR^i(m) \subset \Sigma^i(|A|^m)$.

Given a fixed two-person zero-sum game $G = (A, B, h)$, we denote by
$V_{m_1, m_2}$ the value of the undiscounted infinitely repeated game $G$ where
player $i$ is restricted to mixed strategies with support in $BR^i(m_i)$. Lehrer (1988)
proves the following result, which is related to, and similar in spirit to, the
result of Ben-Porath (1986, 1993).
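
The automaton built from a bounded recall strategy can be sketched in a few lines. The code below is a minimal illustration for a two-player game: the state is the remembered window of $m$ action profiles, and the transition drops the oldest profile and appends the newest. All names (`bounded_recall_automaton`, `play`) and the tit-for-tat example are assumptions made for illustration.

```python
from itertools import product


def bounded_recall_automaton(profiles, m, f, e):
    """Return (states, initial_state, action, transition) for recall m.

    profiles: all action profiles (own action, opponent action)
    f:        maps an m-tuple of profiles to player i's action
    e:        initial memory, an m-tuple of profiles
    """
    states = list(product(profiles, repeat=m))  # |A|^m states

    def transition(x, y):
        # Forget the oldest profile; remember (own action, opponent action y).
        return x[1:] + ((f(x), y),)

    return states, tuple(e), f, transition


def play(automaton, opponent_actions):
    """Actions chosen by the automaton against a fixed opponent sequence."""
    _, state, action, transition = automaton
    own = []
    for y in opponent_actions:
        own.append(action(state))
        state = transition(state, y)
    return own


# Tit-for-tat has recall 1: repeat the opponent's last remembered action.
PROFILES = list(product("CD", repeat=2))
TIT_FOR_TAT = bounded_recall_automaton(
    PROFILES, 1, lambda x: x[-1][1], [("C", "C")]
)
```

Here `len(TIT_FOR_TAT[0])` is $|A|^m = 4$, matching the size bound stated above.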

Theorem 8 (Lehrer, 1988). For every function $m : \mathbb{N} \to \mathbb{N}$ with
$\log m(n)/n \to 0$ as $n \to \infty$,

$$\liminf_{n\to\infty} V_{n, m(n)} \geq \mathrm{Val}(G).$$

Proof. Note that, by identifying a strategy with its equivalence class,
$BR^2(m(n)) \subset \Sigma^2(|A \times B|^{m(n)})$. Let $k : \mathbb{N} \to \mathbb{N}$ with
$\lim_{n\to\infty} m(n)/k(n) = 0$ and $\lim_{n\to\infty} \log k(n)/n = 0$. E.g., $k = m^2$.
Let $X = X_1, X_2, \ldots$ be a random $k(n)$-periodic sequence of actions, with
$X_1, \ldots, X_{k(n)}$ i.i.d. and $X_t$ an optimal strategy of player 1 in the one shot
game. W.l.o.g. we assume that the support of $X_t$ has at least two elements.
Consider the mixed strategy $\sigma^1(k(n))$ of player 1 which plays the realization of
$X$. The proof of Theorem 1 shows that for any sequence of strategies
$\tau(n) \in \Sigma^2(|A \times B|^{m(n)})$,
$\liminf_{n\to\infty} h_\infty(\sigma^1(k(n)), \tau(n)) \geq \mathrm{Val}(G)$. (Note that
$\sigma^1(k(n)) \notin \Delta(BR^1(n))$.) It is thus sufficient to prove that the norm distance
of $\sigma^1(k(n))$ from $\Delta(BR^1(n))$ tends to zero as $n \to \infty$, i.e., that for most
realizations of $X$, the implied pure strategy is in $BR^1(n)$.

Note that for any $0 \leq s < t < k(n)$ there are positive integers $s'$ and $t'$ with
$t \leq t' \leq t + n - [n/3]$ and $s \leq s' \leq s + n - [n/3]$ such that
$X_{s'+1}, \ldots, X_{s'+[n/3]}, X_{t'+1}, \ldots, X_{t'+[n/3]}$ are independent and
$(X_{s+1}, \ldots, X_{s+n}) = (X_{t+1}, \ldots, X_{t+n})$ only if
$(X_{s'+1}, \ldots, X_{s'+[n/3]}) = (X_{t'+1}, \ldots, X_{t'+[n/3]})$. Indeed, if
$\min\{t - s, s + k(n) - t\} \geq [n/3]$ set $s' = s$ and $t' = t$; if $t < s + [n/3]$ set
$s' = s$ and choose $t'$ so that $t' - s$ is the smallest multiple of $t - s$ which is
$\geq [n/3]$; and if $s + k(n) < t + [n/3]$ set $t' = t$ and choose $s'$ so that $s' - t$ is
the smallest multiple of $s + k(n) - t$ which is $\geq [n/3]$. There is a constant
$0 < \alpha < 1$ that depends on the optimal strategy of player 1 in the one shot
game, such that $\Pr(X_{s'+i} = X_{t'+i}) \leq \alpha$ (e.g., if $p$ is the probability vector
associated with the optimal strategy in the one shot game, $\alpha = \sum_a p(a)^2$).
Therefore,

$$\Pr(\forall i,\ 1 \leq i \leq [n/3],\ X_{s'+i} = X_{t'+i}) \leq \alpha^{[n/3]}.$$

Therefore

$$\Pr(\exists s, t,\ 0 \leq s < t < k(n) \text{ s.t. } \forall i,\ 0 < i \leq n,\ X_{s+i} = X_{t+i}) \leq k(n)^2 \alpha^{[n/3]} \to_{n\to\infty} 0.$$

Note that the strategy $\sigma^*(n)$ which is defined as the strategy $\sigma^1(k(n))$
conditional on $\{\forall s, t,\ 0 \leq s < t < k(n),\ \exists\, 1 \leq i \leq n \text{ s.t. } X_{s+i} \neq X_{t+i}\}$ is in
$\Delta(BR^1(n))$ with $d(\sigma^*(n), \sigma^1(k(n))) \to 0$ as $n \to \infty$, where
$d(\sigma^*(n), \sigma^1(k(n)))$ denotes the norm distance between the mixed strategies
(viewed as distributions) $\sigma^*(n)$ and $\sigma^1(k(n))$. Therefore

$$\liminf_{n\to\infty} \mathrm{Val}(BR^1(n), \Sigma^2(|A \times B|^{m(n)}), h_\infty) \geq \mathrm{Val}(G),$$

which completes the proof. ■
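
The key event in the proof is that all windows of $n$ consecutive actions in the periodic sequence are distinct: then the last $n$ actions of player 1 identify the position in the cycle, so the realization can be played with recall $n$. The following Monte Carlo sketch estimates how often that event occurs. The function names, the number of trials and the fixed seed are illustrative assumptions; this is a numerical illustration, not part of the proof.

```python
import random


def windows_distinct(period, n):
    """True if all length-n windows of the periodic extension are distinct."""
    k = len(period)
    seen = set()
    for s in range(k):
        window = tuple(period[(s + i) % k] for i in range(n))
        if window in seen:
            return False
        seen.add(window)
    return True


def fraction_implementable(k, n, probs, trials=2000, seed=0):
    """Estimate Pr(all windows distinct) for an i.i.d. period of length k."""
    rng = random.Random(seed)
    actions = list(range(len(probs)))
    hits = 0
    for _ in range(trials):
        period = rng.choices(actions, weights=probs, k=k)
        hits += windows_distinct(period, n)
    return hits / trials
```

For example, `fraction_implementable(20, 30, [0.5, 0.5])` estimates the probability for a fair two-action draw with period 20 and recall 30; the union bound in the proof suggests it should be close to 1.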

The next result is an analog of Proposition 2. 

Proposition 5 For every 2-player 0-sum game $H = (A, B, h)$ there is a
positive constant $K$ such that if $m : \mathbb{N} \to \mathbb{N}$ with $n \geq K m(n)$, then for every $n$
there exists a strategy $\sigma(n) \in BR^1(n)$ such that

$$\lim_{n\to\infty} \left(\inf\{h_t(\sigma(n), \tau) \mid \tau \in BR^2(m(n)),\ t \geq \exp(n)\}\right) = \mathrm{Val}(H).$$

An interesting straightforward corollary of this proposition is the following.
Let $G = (A, r)$ be an $n$-person game, $A = \times_{1 \leq i \leq n} A_i$ and $r = (r^i)_{1 \leq i \leq n}$, and
assume that $m_1(k) \leq \ldots \leq m_n(k)$.

Corollary 2 There is a constant $K$ such that if $K m_1(k) < m_2(k)$, then
for every $k$ there exists an $(n-1)$-tuple of strategies $\tau(k) = (\tau^2, \ldots, \tau^n) \in
\times_{1 < i \leq n} BR^i(m_i(k))$ such that for any strategy $\sigma(k) \in BR^1(m_1(k))$,

$$\limsup_{k\to\infty} r^1(\sigma(k), \tau(k)) \leq \max_{q \in \Delta(A_1)} \min_{a^{-1} \in A_{-1}} \sum_{a^1 \in A_1} q(a^1)\, r^1(a^1, a^{-1}).$$

The next result is an analog of Proposition 3. Assume that $m_1(k) \leq
\ldots \leq m_n(k)$ with $\lim_{k\to\infty} \log m_n(k)/m_1(k) = 0$. For every $1 \leq i \leq n$
denote by $v^i(m_1(k), \ldots, m_n(k))$, or $v^i(k)$ for short, the minimax payoff to
player $i$, i.e., $\min_{\tau^{-i}} \max_{\sigma^i} r^i(\sigma^i, \tau^{-i})$, where the min ranges over all
$\tau^{-i} \in \times_{j \neq i} \Delta(BR^j(m_j(k)))$ and the max ranges over all $\sigma^i \in BR^i(m_i(k))$.
As in section 4, we denote by $Q(i)$, or $Q$ for short, the set of all probability measures
on $A_{-i}$ whose marginal distribution on $\times_{j < i} A_j$ is a product measure.







Proposition 6 If for a fixed player 1 < i < n, limjb_^oo = oo, 

then 

$$\limsup_{k\to\infty} v^i(k) \leq \min_{q \in Q} \max_{a^i \in A_i} \sum_{a^{-i} \in A_{-i}} q(a^{-i})\, r^i(a^i, a^{-i}).$$

We state now a result which provides a partial answer to the asymptotic 
behavior of the set of equilibrium payoffs of repeated games with bounded 
recall. It is an analog of Proposition 4. 

Proposition 7 There is a constant K > 0 such that if mi{k) < 
limife_oo > Kmi{k), and lirrifc^oo = 

then 

hmmfE{G{BR\mi{k)), 5i?"(m„(fc)))) D F 

k—*oo 

The next proposition and conjecture address the advantage of an
unrestricted player over a player restricted to bounded recall strategies in finitely
repeated 2-player 0-sum games. For a fixed two-person zero-sum game $H =
(A, B, h)$, we denote by $V_n^T$ the value of the $T$-repeated game in which
player 1 is restricted to use strategies in $BR^1(n)$ while player 2 can use any
strategy. The following proposition asserts that if the duration $T$ is
shorter than some exponential function of $n$ then the unrestricted player has
no advantage.

Proposition 8 There exists a constant $K > 0$ such that if $T : \mathbb{N} \to \mathbb{N}$
satisfies $T(n) \leq \exp(Kn)$, then

$$\lim_{n\to\infty} V_n^{T(n)} = \mathrm{Val}\, H.$$

Proof. It is sufficient to prove the result in the case that any optimal strategy
of player 1 in the one shot game is not pure. Let $X_1, X_2, \ldots$ be a sequence of
i.i.d. optimal strategies in the one shot game. The stochastic process $X_1, \ldots$
induces a strategy $\sigma \in \Delta(BR^1(n))$ as follows. For each realization of the
random sequence define the initial memory $e = (X_1, \ldots, X_n)$, and define the action
function $f : (A \times B)^n \to A$ as follows: for every $(a_1, b_1), \ldots, (a_n, b_n)$
define the stopping time $S$ as the smallest value of $t$ such that $(a_1, \ldots, a_n) =
(X_{t-n}, \ldots, X_{t-1})$ and define

$$f((a_1, b_1), \ldots, (a_n, b_n)) = X_S.$$

Note that the strategy induced by each realization of the random sequence
$X_1, \ldots$ consists of a deterministic sequence (which eventually enters a cycle)
of actions of player 1 and that the sequence is independent of the strategy of
player 2. It is easy to verify that there is a positive constant $K > 0$ such that

lim Fioha{at{a) = Xt^n Vt < expXn) = 0 



. . . < mn{k), 
: 0 for i > 1, 
and therefore the norm distance between the strategy $\sigma$ and the strategy $\sigma^X$
that plays the i.i.d. sequence $X_1, X_2, \ldots$ goes to zero as $n$ goes to infinity.
As $\sigma^X$ is an optimal strategy in the finitely repeated game, the result follows.

■ 

The conjecture below claims that there is an exponential function of $n$,
$\exp(Kn)$, such that if the number of repetitions $T$ is larger than $\exp(Kn)$, the
values converge to the maxmin in pure strategies.

Conjecture 4 There is a constant $K$ such that if $T : \mathbb{N} \to \mathbb{N}$ satisfies
$T(n) \geq \exp(Kn)$, then

$$\lim_{n\to\infty} V_n^{T(n)} = \max_{a \in A} \min_{b \in B} h(a, b).$$

As $BR^1(n) \subset \Sigma^1(|A \times B|^n)$, a positive answer to the second part of
Conjecture 2 provides also a positive answer to Conjecture 4. It is of interest to
study the asymptotics of $V_n^{T(n)}$ where $T(n)$ is approximately a fixed
exponential function of $n$. This would close the gap between Proposition 8 and
Conjecture 4. Given a 2-person 0-sum game $H = (A, B, h)$, it will be
interesting to find the largest (smallest) function $\underline{u} : (0, \infty) \to \mathbb{R}$
($\bar{u} : (0, \infty) \to \mathbb{R}$) such that if $\log T(n)/n \to \alpha$ as $n \to \infty$ then

$$\underline{u}(\alpha) \leq \liminf_{n\to\infty} V_n^{T(n)} \leq \limsup_{n\to\infty} V_n^{T(n)} \leq \bar{u}(\alpha).$$

Proposition 8 asserts that there is a constant $K_1 > 0$ such that $\underline{u}(K_1) =
\mathrm{Val}\, H$ and the conjecture claims that there is a constant $K_2$ with $\bar{u}(K_2) =
\max_{a \in A} \min_{b \in B} h(a, b)$. It is interesting to find the sup and inf of $K_1$ and $K_2$
respectively. We conjecture that the two functions $\underline{u}$ and $\bar{u}$ are continuous
with $\underline{u} = \bar{u}$ for all values of $\alpha$ with the possible exception of one critical value.
We do not exclude the possibility of the existence of a positive constant $K$
such that $\underline{u}(K) = \mathrm{Val}(H)$ and $\bar{u}(K) = \max_{a \in A} \min_{b \in B} h(a, b)$.
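
The two limits discussed above, $\mathrm{Val}\, H$ and the maxmin in pure strategies, can differ substantially. The sketch below computes both for a $2 \times 2$ zero-sum game, using the textbook closed form for the value when there is no saddle point. The function names are illustrative assumptions, and the code covers only the $2 \times 2$ case.

```python
def pure_maxmin(h):
    """max over rows of the min over columns (row player maximizes)."""
    return max(min(row) for row in h)


def pure_minmax(h):
    """min over columns of the max over rows."""
    return min(max(row[j] for row in h) for j in range(len(h[0])))


def value_2x2(h):
    """Value of a 2x2 zero-sum game in mixed strategies."""
    low, high = pure_maxmin(h), pure_minmax(h)
    if low == high:  # saddle point in pure strategies
        return low
    (a, b), (c, d) = h
    # Standard formula when no pure saddle point exists.
    return (a * d - b * c) / (a + d - b - c)


MATCHING_PENNIES = [[1, -1], [-1, 1]]
```

For matching pennies the pure maxmin is $-1$ while the value is $0$, so a player with bounded recall facing a long enough horizon would, under Conjecture 4, lose the whole gap.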

7 Variations and Extensions 

We have studied here some topics in the theory of repeated games with
deterministic automata. There are several variants of the concept of automata
which merit study in the context of repeated games broadly conceived, i.e.,
including repeated games with incomplete information and stochastic games.
The variations of the concept of an automaton are in several independent
dimensions. E.g., we can allow transitions that depend on the actions of all
players and also allow for probabilistic actions and/or transitions, and
moreover we can consider transition and/or action functions which are time
dependent.

A full automaton for player $i$ is a 4-tuple $(M, q_1, f, g)$ where the set of
states $M$, the initial state $q_1 \in M$ and the action function $f : M \to A_i$ are
as in a (standard) automaton, and the transition function $g : M \times A \to M$
specifies the next state as a function of the current state and the $n$-tuple
of actions of all players. The strategy $\sigma^i$ induced by a full automaton
$\alpha = (M, q_1, f, g)$ is defined naturally by $\sigma^i_1 = f(q_1)$ and for every
$a_1, \ldots, a_{t-1}$ in $A$, $\sigma^i(a_1, \ldots, a_{t-1}) = f(q_t)$ where $q_t$ is defined
inductively for $t > 1$ by $q_t(a_1, \ldots, a_{t-1}) = g(q_{t-1}, a_{t-1})$. Obviously, every
strategy which is induced by a full automaton of size $m$ is equivalent to a
strategy induced by an automaton of size $m$. Therefore when the actions and
transitions are deterministic, allowing transitions to depend on your own action
does not affect the equilibrium theory. However it does have implications in the
study of subgame perfect equilibrium of repeated games (see e.g. Kalai and
Stanford (1988) and Ben-Porath and Peleg (1987)). Kalai and Stanford (1988)
show that given a pure strategy $\sigma^i$ of a player in the repeated game, the
number of different strategies induced by it and any finite history
$(a_1, \ldots, a_t)$ equals the size of the full automaton that induces $\sigma^i$.

A mixed action automaton for player $i$ is a 4-tuple $(M, q_1, f, g)$ where $M$
is a finite set, $q_1 \in M$ is the initial state, $f : M \to \Delta(A_i)$ is a function
specifying a mixed action as a function of the state, and $g : M \times A \to M$ is
the transition function. Each mixed action automaton induces a behavioral
strategy $\sigma^i$ as follows. $\sigma^i_1 = f(q_1)$. Define inductively
$q_t(a_1, \ldots, a_{t-1}) = g(q_{t-1}, a_{t-1})$, and

$$\sigma^i_t(a_1, \ldots, a_{t-1}) = f(q_t(a_1, \ldots, a_{t-1})).$$

Denote by $\Sigma^i_p(m_i)$ all equivalence classes of behavioral strategies which are
induced by a mixed action automaton of size $m_i$. Two mixed (or behavioral)
strategies, $\sigma^i$ and $\tau^i$, of player $i$ are equivalent if for any $N \setminus \{i\}$-tuple of
pure strategies $\sigma^{-i}$, $(\sigma^i, \sigma^{-i})$ and $(\tau^i, \sigma^{-i})$ induce the same distribution on
the play of the repeated game. Note that $\Sigma^i(m_i) \subset \Sigma^i_p(m_i)$ and that
$\Sigma^i_p(1) \setminus \Delta(\cup_{m=1}^{\infty} \Sigma^i(m)) \neq \emptyset$. A stationary behavioral strategy in a
repeated game with complete information is induced by a mixed action automaton
with one state. Given a behavioral strategy $\sigma^i = (\sigma^i_t)_{t=1}^{\infty}$, the number of
equivalence classes of the behavioral strategies of the form
$(\sigma^i \mid a_1, \ldots, a_t)$, where $a_1, \ldots, a_t$ ranges over all histories which are
consistent with $\sigma^i$ (i.e., $(\sigma^i \mid a_1, \ldots, a_s)(a^i_{s+1}) > 0$ for every $s < t$),
equals the size of the smallest mixed action automaton that implements $\sigma^i$.

A probabilistic transition automaton is a 4-tuple $(M, q_1, f, g)$ where $M$ is a
finite set, $q_1 \in M$ is the initial state, $f : M \to A_i$ is the action function,
and $g : M \times A_{-i} \to \Delta(M)$. Each probabilistic transition automaton induces
a mixed strategy $\sigma^i$ as follows. $\sigma^i_1 = f(q_1)$. Then the automaton changes its
states stochastically in the course of playing the repeated game. If its state
in stage $t$ is $q_t$ and the other players' action in stage $t$ is $a^{-i}_t$, the conditional
probability of $q_{t+1}$, given the past, is $g(q_t, a^{-i}_t)$, and its action in stage $t$ is
$f(q_t)$. Denote by $\Sigma^i_t(m_i)$ all equivalence classes of strategies which are induced
by probabilistic transition automata of size $m_i$. Note that
$\Sigma^i_p(m) \subset \Sigma^i_t(m|A_i|)$.
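
The remark that a deterministic full automaton is equivalent to a standard automaton of the same size can be made concrete: since the automaton's own action is $f(q)$, the transition $g(q, (a, b))$ can be rewritten as a function of $q$ and the opponent's action alone. The sketch below shows this for a two-player game. The function names and the grim-trigger example are illustrative assumptions.

```python
def run_full_automaton(q1, f, g, opponent_actions):
    """Play a full automaton (M, q1, f, g) against a fixed opponent sequence."""
    q, own = q1, []
    for b in opponent_actions:
        a = f(q)
        own.append(a)
        q = g(q, (a, b))  # the transition sees the whole action profile
    return own


def to_standard_transition(f, g):
    """Transition of an equivalent standard automaton on the same states."""
    return lambda q, b: g(q, (f(q), b))


def run_standard_automaton(q1, f, g_std, opponent_actions):
    """Play a standard automaton whose transition ignores its own action."""
    q, own = q1, []
    for b in opponent_actions:
        own.append(f(q))
        q = g_std(q, b)
    return own


# Grim trigger written as a full automaton with two states.
def grim_action(q):
    return "C" if q == "coop" else "D"


def grim_transition(q, profile):
    _, b = profile
    return "coop" if q == "coop" and b == "C" else "punish"
```

Running both versions against the same opponent sequence yields the same play, which is the sense in which the two automata induce equivalent strategies.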

Repeated games with complete information. The theory of finitely
or infinitely repeated 2-person 0-sum games with complete information and
either mixed action or probabilistic transition automata is trivial and not of
much interest. However, the asymptotic behavior of the set of equilibrium
payoffs of $n$-person ($n \geq 3$) infinitely repeated or 2-person finitely repeated
games with either mixed action or probabilistic transition automata is
unknown and of interest. The difficulty in the study of equilibrium payoffs
of $n$-person infinitely repeated games is that the asymptotics of the minmax
payoffs are unknown. As for 2-player finitely repeated games with either
mixed action or probabilistic transition automata, it seems that our
constructed equilibrium (Neyman 1995) in the finitely repeated games remains
an equilibrium in the game in which players are restricted to play with either
mixed action or probabilistic automata with the same bounds. Note that as
$\Sigma^i(m)$ is a proper subset of $\Sigma^i_p(m)$ (and of $\Sigma^i_t(m)$), the assertion that
$\sigma^* \in \Delta(\Sigma^1(m_1)) \times \Delta(\Sigma^2(m_2))$ is an equilibrium in
$(\{1, 2\}; \Sigma^1_p(m_1), \Sigma^2_p(m_2); r_T)$ (in $(\{1, 2\}; \Sigma^1_t(m_1), \Sigma^2_t(m_2); r_T)$) is
stronger than the assertion that it is an equilibrium in $G^T(m_1, m_2)$. Moreover,
in this setup, holding a player down to his individually rational payoff requires
just one state (a fixed finite number of states) and therefore in the theorems
there is only a need to bound the size of one of the automata.

Repeated games with incomplete information. The theory of
repeated games with incomplete information and either probabilistic or
deterministic action function is of interest. Here the initial state is allowed to
be a function of the initial information, or equivalently, the initial move of
nature is part of the input at stage 0. Alternatively, one allows the action
function to depend on the state of the machine and the information about
the state of nature. It is relatively easy to verify that in the case of 2-person
0-sum repeated games with incomplete information on one side the value of
the “restricted game” $\Gamma(p, m_1, m_2)$ converges to $\lim v_n(p)$ as $m_i \to \infty$ and
$(\log \max\{m_1, m_2\})/\min\{m_1, m_2\} \to 0$. It is of interest to find whether in
repeated games with incomplete information on both sides and under the above
asymptotic conditions on $m_1$ and $m_2$ the values of $\Gamma(p, m_1, m_2)$ converge to a
limit and whether this limit equals $\lim v_n(p)$.

Stochastic Games. My initial interest in the theory of repeated games
with finite automata stemmed from my work with J.-F. Mertens on the
existence of a value in stochastic games. The $\varepsilon$-optimal strategies exhibited there
are behavioral strategies which are not implemented by any finite state mixed
action automaton. Blackwell and Ferguson (1968) show that in the “Big
Match” there are no stationary (i.e. implemented by a mixed action automaton
of size 1) $\varepsilon$-optimal strategies, and it can be shown further that there are
no $\varepsilon$-optimal strategies which are implemented by a mixture of mixed action
automata of finite size. However, when both players are restricted to strategies
that are implemented by either deterministic or mixed action automata of sizes
$m_1$ and $m_2$ we are faced with a 2-person 0-sum game $G(m_1, m_2)$ in normal
form which has a value $V(m_1, m_2)$. It is of interest to study the asymptotic
behavior of $V(m_1, m_2)$ as $m_i \to \infty$. Consider the “Big Match” (Blackwell
and Ferguson 1968), which is an example of a 2-person 0-sum stochastic game.

$$\begin{pmatrix} 1 & 0 \\ 0^* & 1^* \end{pmatrix}$$

The value of the (unrestricted) undiscounted game is 1/2. Mor Amitai
(1989) showed that for this game there is a polynomial function $m : \mathbb{N} \to \mathbb{N}$
such that the value of the restricted “Big Match” where player 1 and player 2
are restricted to strategies which are implemented by automata of sizes $m(n)$
and $n$ respectively converges as $n \to \infty$ to 1.

Another generalization of automata suggested by the theory of stochastic
games is a time dependent probabilistic action automaton. In a time
dependent automaton the action of the automaton depends on its internal state
and the stage. This generalizes also the concept of a Markov strategy.
Blackwell and Ferguson (1968) have shown that in the “Big Match” there are no
$\varepsilon$-optimal Markov strategies. This leads to the natural question as to whether
or not there are $\varepsilon$-optimal strategies which are implemented by a finite state
time dependent probabilistic automaton. When I raised this question in a
seminar at Stanford University, Jerry Green pointed out to me the work of
T. Cover (1969), which illustrates a statistical decision problem in which a
difference between the stationary finite automata and the time dependent
automata emerges.

My attention to the topic of repeated games with bounded automata was
awakened in discussions I had with Alan Hoffman during his visit
to Jerusalem in the spring of 1983. Hoffman informed me that in the early
fifties, when engineers at SEAC were actually playing tick-tack-toe on the
SEAC, he was concerned with how game theorists would view and study the fact
that a 2-person 0-sum fair (value 0) game becomes an unfair game when players
are restricted by their “programs”. This prompted me to pose the
problem settled by Ben-Porath and later to study the possible cooperation
in finitely repeated games with bounded automata. I am indebted to
each one of the above mentioned individuals for their influence, either directly
or indirectly, on my work on repeated games with finite automata.

Mor Amitai proved the following interesting result concerning the maxmin 
of stochastic games with probabilistic transition automata: for any stochastic 
game there exists a constant m such that for any mi and any strategy cr^ G 
Ej(mi) and £: > 0 there exists a strategy G S^(m) such that 

< sup inf (<7, r) + e 

where the sup ranges over all stationary strategies of player 1 and the inf 
ranges over all strategies of player 2. 
Abstract. We present a selective survey of recent work on the Brown-Robinson

1. Introduction

The notion of a Nash equilibrium plays a key role in modern economics. Indeed, it is
only a slight exaggeration to say that it has become the sole basis of positive theory.
Despite its central role, it has become clear that there are few reasons to warrant the use
of equilibrium as a predictive tool.

Theories of equilibrium may be divided into two broad categories. The first
category consists of epistemic (or knowledge based) theories and the second of dynamic
theories.^1

This is intended to be the first of a two-part selective survey of some recent 
developments in the study of dynamical systems associated with games. In this paper 
we study a simple model of learning and in a companion paper we study a simple model 
of evolutionary dynamics. 

1.1 Epistemic Versus Dynamic Theories

Epistemic theory explores the hypothesis that common knowledge of various aspects 
of the game will lead rational players, via a process of introspection, to play an 
equilibrium. However, the informational and rationality requirements needed to arrive 
at an equilibrium are extreme, and thus it is difficult to argue that such a theory provides 
a justification of the equilibrium idea for positive purposes. In particular, for players to 
reach an equilibrium requires not only that the rationality of the players and the payoffs 
be common knowledge, but also that the beliefs they hold about each other’s behavior 
be commonly known.^2

^1 Binmore (1988) calls these the eductive and evolutive approaches, respectively.

^2 Aumann and Brandenburger (1995) present a “state of the art” account of the epistemic theory.

Dynamic theories explore the hypothesis that equilibrium is reached via a process
of gradual adjustment by boundedly rational players who encounter each other in a 
repeated setting. In contrast to the epistemic theory, the informational and rationality 
requirements of the dynamic theories are usually minimal. If anything, the dynamic 
theories are to be faulted for postulating behavior that is too naive. Nevertheless, Nash 
equilibria are rest points of most dynamic processes associated with games. The key 
question then is whether the particular dynamic process will converge to an equilibrium. 
Typical examples are the fictitious play learning process and the replicator dynamics 
from evolutionary biology. The former is the main subject of this survey and the latter 
the subject of an accompanying survey. 

1.2 General Versus Special Theories 

What can we learn from the two theories? At the most general level, the two theories 
make identical and rather weak predictions. Under the most plausible assumptions 
(common knowledge of rationality and payoffs), the epistemic theory implies that 
players will choose from the set of rationalizable outcomes.^3 For general games, most
dynamic theories also make exactly the same prediction: in the long run the choices of 
the players will be rationalizable. 

Is one approach superior to the other as the basis of Nash equilibrium? At a 
general level, the answer is no, and there is little to choose between the two theories. 
However, the dynamic theories can make sharper predictions in cases where the 
underlying processes can be shown to converge. Although certain well-known examples 
demonstrate that the convergence of these processes cannot be established in general, 
there are large classes of games with special structures for which convergence to an 
equilibrium can be guaranteed. Thus dynamic processes can form the basis of special 
theories. This is in marked contrast to the epistemic approach. There do not appear to 
exist interesting classes of games for which the predictions of the epistemic theory will 
be sharper than in general.^4

In this paper we study the special learning process known as fictitious play. This 
process is chosen for its inherent simplicity, intuitive appeal, and historical interest. 
Moreover, it appears that fictitious play shares many important structural features with 
more complicated learning processes. It seems quite likely that results obtained for the 
fictitious play process will hold for a more general class of processes. 

1.3 Experimental Evidence 

Before proceeding further, we wish to draw attention to some of the lessons that seem to 
be emerging from the now large accumulation of experimental findings on the behavior
of subjects in games. Smith (1990) suggests that (i) in a one-shot game, behavior is not 
well predicted by equilibrium; (ii) in a repeated, complete information setting players 
tend to “cooperate” and thus repeated game effects emerge, though they are hard to 
predict; and (iii) in a repeated setting with incomplete information, where players have 



^3 See Bernheim (1984) and Pearce (1984).

^4 Unless, of course, the set of rationalizable strategy combinations is a singleton.

knowledge of their own payoffs only, the best predictors of long-run behavior are the
equilibria of the one-shot game with complete information. 

Since the environment in (iii) is exactly the environment that most learning processes 
postulate (see the next section), this emergent finding lends additional credence to the 
view that such processes are indeed worthy of study as the basis of a positive theory for 
games.^5



2. Preliminaries 

Let $G = (A, B)$ be a two-player game where $A$ and $B$ are $I \times J$ matrices. We will
refer to $I = \{1, 2, \ldots, I\}$ and $J = \{1, 2, \ldots, J\}$ as the sets of pure strategies available
to players 1 and 2, respectively. As usual, if player 1 chooses strategy $i$ and player 2
chooses strategy $j$, the payoff to player 1 is $a_{ij}$, and the payoff to player 2 is $b_{ij}$. The
sets of mixed strategies are denoted by $\Delta(I)$ and $\Delta(J)$, respectively. Let $\delta_i \in \Delta(I)$
be the mixed strategy that assigns weight 1 to $i$. We will identify $i$ with $\delta_i$ and write
$i \in \Delta(I)$ instead of $\delta_i \in \Delta(I)$.

For all $q \in \Delta(J)$, let $BR(q)$ be the set of pure strategy best responses for player
1. The strategy pair $(p^*, q^*)$ is a Nash equilibrium if $\{i : p^*_i > 0\} \subset BR(q^*)$ and
$\{j : q^*_j > 0\} \subset BR(p^*)$.

Let $B$ be a selection from $BR$; that is, for all $q$, $B(q) \in BR(q)$. We assume
that $BR(q) = BR(q')$ implies $B(q) = B(q')$. Similarly, for all $p \in \Delta(I)$, let
$B(p) \in BR(p)$.
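
The best response set and one admissible selection can be sketched as follows. Indices are 0-based in the code, whereas the text counts strategies from 1. The "lowest index" rule is only one possible selection and is an assumption made for illustration; it satisfies the requirement that $BR(q) = BR(q')$ implies $B(q) = B(q')$, because it depends on $q$ only through $BR(q)$.

```python
def best_responses(A, q, tol=1e-12):
    """Set of rows maximizing player 1's expected payoff against q."""
    payoffs = [sum(a * w for a, w in zip(row, q)) for row in A]
    best = max(payoffs)
    return {i for i, u in enumerate(payoffs) if u >= best - tol}


def select(A, q):
    """A selection B(q): the lowest-index pure best response."""
    return min(best_responses(A, q))
```

For player 2 the same functions apply to the transpose of the payoff matrix $B$, with $p$ in place of $q$.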

2.1 Fictitious Play 

We consider the following dynamic process: At time $t = 0$, players randomly select pure
strategies, $i(0)$, $j(0)$. At time $t = 1$, each player chooses a strategy that is a best response
to the other player's strategy in the previous period; thus, $i(1) = B(j(0))$ and $j(1) = B(i(0))$.
For all subsequent periods, each player chooses a strategy that is a best response to the
history of strategies employed by the other player, treating the history as outcomes arising
from an underlying mixed strategy employed by that player. This is the “fictitious play”
process first defined by Brown (1951).



Definition 1 For $t = 0, 1, 2, \ldots$, the sequence $(p(t), q(t))$ is a discrete time fictitious
play process (DFP) if

$$(p(0), q(0)) \in \Delta(I) \times \Delta(J);$$

and, for all $t \geq 0$,

$$p(t+1) = \frac{t\,p(t) + B(q(t))}{t+1}, \qquad q(t+1) = \frac{t\,q(t) + B(p(t))}{t+1}. \qquad (1)$$



^5 See Boylan and El-Gamal (1993) on some experimental results on fictitious play and Roth and Erev (1995)
for a survey of experimental work on learning in games.

Thus $p(t+1)$ is a weighted average of $p(t)$ and $B(q(t))$ where the weights are $\frac{t}{t+1}$
and $\frac{1}{t+1}$. New strategies are chosen each “period”. Now suppose $\delta > 0$ is the time
between adjustments; replacing the weights by $\frac{t}{t+\delta}$ and $\frac{\delta}{t+\delta}$ we get:

$$p(t+\delta) = \frac{t\,p(t) + \delta B(q(t))}{t+\delta}$$

or, equivalently:

$$\frac{p(t+\delta) - p(t)}{\delta} = \frac{B(q(t)) - p(t+\delta)}{t}.$$

As $\delta \to 0$, we obtain the right derivative of $p(t)$:

$$\frac{dp(t)}{dt} = \frac{B(q(t)) - p(t)}{t}.$$

This is not defined for $t = 0$, so the continuous time version should start at some
$t_0 > 0$, say $t_0 = 1$. This leads to the following definition:

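Definition 2 For $t \geq 1$, the path $(p(t), q(t))$ is a continuous time fictitious play
process (CFP) if

$$(p(1), q(1)) \in \Delta(I) \times \Delta(J);$$

and

$$\frac{dp(t)}{dt} = \frac{B(q(t)) - p(t)}{t}, \qquad \frac{dq(t)}{dt} = \frac{B(p(t)) - q(t)}{t}. \qquad (2)$$

The system in (2) may be compactly written as:

$$\dot{p} = \frac{1}{t}\left[B(q) - p\right], \qquad \dot{q} = \frac{1}{t}\left[B(p) - q\right]. \qquad (3)$$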

Proposition 1 Suppose $(p, q) \to (p^*, q^*)$. Then $(p^*, q^*)$ is a Nash equilibrium of $G$.

Proof. Suppose not. Then there exists a player, say 1, and a pure strategy $i$ for player
1 such that $p^*_i > 0$, but $i \notin BR(q^*)$. Since $q(t) \to q^*$ there exists a $T$ such that for all
$t > T$, $i \notin BR(q(t))$, and thus $i \neq B(q(t))$. But this implies that $\lim p_i(t) = p^*_i = 0$,
which is a contradiction. ■
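
The discrete process of Definition 1 is easy to simulate. The sketch below uses exact rational arithmetic and the lowest-index tie-breaking rule as the selection $B$; both choices, and all function names, are illustrative assumptions rather than part of the definition.

```python
from fractions import Fraction


def argmax_first(values):
    """Index of the largest value; ties go to the lowest index."""
    return max(range(len(values)), key=lambda k: (values[k], -k))


def dfp(A, payoff_B, p0, q0, steps):
    """Run the discrete time fictitious play update for `steps` periods.

    A, payoff_B: payoff matrices of players 1 and 2 (lists of rows)
    p0, q0:      initial mixed strategies p(0), q(0)
    """
    rows, cols = len(A), len(A[0])
    p = [Fraction(x) for x in p0]
    q = [Fraction(x) for x in q0]
    for t in range(steps):
        # Both best responses use the beliefs held before the update.
        i = argmax_first([sum(A[r][c] * q[c] for c in range(cols))
                          for r in range(rows)])
        j = argmax_first([sum(payoff_B[r][c] * p[r] for r in range(rows))
                          for c in range(cols)])
        p = [(t * p[r] + (1 if r == i else 0)) / (t + 1) for r in range(rows)]
        q = [(t * q[c] + (1 if c == j else 0)) / (t + 1) for c in range(cols)]
    return p, q
```

In a pure coordination game started at a strict equilibrium the process stays put, while in matching pennies the empirical frequencies drift slowly toward $(1/2, 1/2)$, in line with Proposition 1.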

2.2 Best Response Dynamics 

Definition 3 For $s \geq 0$, the path $(x(s), y(s))$ is generated by best response
dynamics (BRD) if

$$(x(0), y(0)) \in \Delta(I) \times \Delta(J);$$

and

$$\frac{dx(s)}{ds} = B(y(s)) - x(s), \qquad \frac{dy(s)}{ds} = B(x(s)) - y(s). \qquad (4)$$

(See Hofbauer (1994) and references therein.)

We now argue that the best response dynamics BRD is “equivalent” to the CFP in
the sense that there is a one to one correspondence between the trajectories generated
by BRD and the trajectories generated by CFP and one system is convergent if and only
if the other is.

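Suppose that for all $t \geq 1$, $(p(t), q(t))$ is a CFP. For all $s \geq 0$, define $(x(s), y(s))$ by
$x(s) = p(e^s)$ and $y(s) = q(e^s)$. Thus we are using the transformation $t = e^s$. We now
obtain:

$$\frac{dx(s)}{ds} = \frac{dp(e^s)}{dt}\, e^s = \frac{1}{e^s}\left[B(q(e^s)) - p(e^s)\right] e^s = \left[B(q(e^s)) - p(e^s)\right] = B(y(s)) - x(s).$$

A similar argument applies to $y(s)$ so that $(x(s), y(s))$ is a BRD.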

Conversely, if for all $s \geq 0$, $(x(s), y(s))$ is a BRD then define for all $t \geq 1$,
$(p(t), q(t))$ by $p(t) = x(\log t)$ and $q(t) = y(\log t)$. It is routine to verify that $(p(t), q(t))$
is a CFP.

This shows that the trajectory generated by a BRD with some initial condition
$(x(0), y(0))$ is isomorphic to the trajectory generated by a CFP with the same initial
condition, that is, $(p(1), q(1)) = (x(0), y(0))$.



2.3 Equivalent Games 



Definition 4 Let $G = (A, B)$ and $G' = (A', B')$ be two games with the same number
of pure strategies for each player. Let $BR$ and $BR'$ be the pure strategy best response
correspondences for $G$ and $G'$ respectively. The games $G$ and $G'$ are said to be
BR-equivalent, written as $G \sim G'$, if $BR = BR'$.

For our purposes, the importance of this definition lies in the fact that the fictitious play
process depends only on the best response correspondence of the game. Formally, let
$p(t)$ be a CFP for the game $G$ and let $p'(t)$ be a CFP for the game $G'$. If $G \sim G'$ and
$p(1) = p'(1)$, then, for all $t$, $p(t) = p'(t)$. The same is true for DFP also.

Proposition 2 Suppose $G$ and $G'$ are two games satisfying: there exist $\alpha > 0$ and 
$\beta > 0$ such that for all $i, i', j, j'$: 

$$a'_{ij} - a'_{i'j} = \alpha\,(a_{ij} - a_{i'j})$$

$$b'_{ij} - b'_{ij'} = \beta\,(b_{ij} - b_{ij'}).$$

Then $G \sim G'$. 

Proof. Let $q \in \Delta(J)$. For all $i$ and $i'$ we have that: 

$$\sum_{j=1}^{J} (a_{ij} - a_{i'j})\, q_j \ge 0 \quad \text{if and only if} \quad \sum_{j=1}^{J} (a'_{ij} - a'_{i'j})\, q_j \ge 0$$

and thus $BR_1 = BR'_1$. Similarly, $BR_2 = BR'_2$. ■ 
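Proposition 2 can be illustrated numerically. The sketch below builds a game $G'$ from $G$ by a positive rescaling plus column constants for player 1 and row constants for player 2, which satisfies the hypothesis of the proposition, and compares pure best responses at randomly drawn beliefs. The helper names, the random payoffs and the sampling scheme are assumptions made for illustration; checking sampled beliefs is evidence, not a proof.

```python
import numpy as np

def same_best_responses(A, B, A2, B2, n_samples=1000, seed=0):
    """Compare pure best responses of (A, B) and (A2, B2) at randomly sampled beliefs."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = A.shape
    for _ in range(n_samples):
        q = rng.dirichlet(np.ones(n_cols))   # belief about player 2
        p = rng.dirichlet(np.ones(n_rows))   # belief about player 1
        if np.argmax(A @ q) != np.argmax(A2 @ q):
            return False
        if np.argmax(p @ B) != np.argmax(p @ B2):
            return False
    return True

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(3, 4))
alpha, beta = 2.5, 0.4
col_shift = rng.normal(size=(1, 4))    # one constant per column j, added to player 1's payoffs
row_shift = rng.normal(size=(3, 1))    # one constant per row i, added to player 2's payoffs
A2 = alpha * A + col_shift             # a'_ij - a'_i'j = alpha (a_ij - a_i'j)
B2 = beta * B + row_shift              # b'_ij - b'_ij' = beta (b_ij - b_ij')

equivalent = same_best_responses(A, B, A2, B2)
not_equivalent = same_best_responses(A, B, -A, B)   # flipping signs changes best responses
```

Ties in best responses have probability zero under this sampling, so comparing `argmax` values is enough here.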
We now establish that every 2 x 2 game without weakly dominated strategies is BR- 
equivalent to either a zero-sum game or to a game with identical payoffs. 

Proposition 3 Suppose $G = (A, B)$ is a 2 x 2 game without any weakly dominated 
strategies. Then there exists a 2 x 2 matrix $C$ such that either (i) $G \sim (C, -C)$; or (ii) 
$G \sim (C, C)$. 

Proof. Without loss of generality, let $a_{11} > a_{21}$ and $a_{12} < a_{22}$. Again, without loss of 
generality, let $a_{11} = b_{11} = 0$. 

Case (i): $0 = b_{11} < b_{12}$ and $b_{21} > b_{22}$. 

By assumption $(a_{22} - a_{12}) - a_{21} > 0$ and $(b_{22} - b_{21}) - b_{12} < 0$. Thus there exists 
a $k < 0$ such that $(a_{22} - a_{12}) - a_{21} = k\,[(b_{22} - b_{21}) - b_{12}]$. Now again without loss 
of generality, we can assume that $k = -1$. Define $C$ to be the matrix: 

$$C = \begin{pmatrix} 0 & -b_{12} \\ a_{21} & (a_{22} - a_{12}) - b_{12} \end{pmatrix} = \begin{pmatrix} 0 & -b_{12} \\ a_{21} & a_{21} - (b_{22} - b_{21}) \end{pmatrix}$$

Using Proposition 2 it is easy to verify that $(A, B) \sim (C, -C)$. 

Case (ii): $0 = b_{11} > b_{12}$ and $b_{21} < b_{22}$. 

By assumption $(a_{22} - a_{12}) - a_{21} > 0$ and $(b_{22} - b_{21}) - b_{12} > 0$. Thus there exists 
a $k > 0$ such that $(a_{22} - a_{12}) - a_{21} = k\,[(b_{22} - b_{21}) - b_{12}]$. Now again without loss 
of generality, we can assume that $k = 1$. Define $C$ to be the matrix: 

$$C = \begin{pmatrix} 0 & b_{12} \\ a_{21} & (a_{22} - a_{12}) + b_{12} \end{pmatrix} = \begin{pmatrix} 0 & b_{12} \\ a_{21} & a_{21} + (b_{22} - b_{21}) \end{pmatrix}$$

Again, using Proposition 2 it is easy to verify that $(A, B) \sim (C, C)$. ■ 



3. Lyapunov Functions 

Before proceeding, some results on asymptotic stability and Lyapunov functions will 
prove helpful. 

Definition 5 Suppose $x^*$ is an equilibrium of the differential equation 

$$\dot{x} = f(x), \qquad (5)$$

where $f$ is a map. Then $x^*$ is an asymptotically stable equilibrium if, (i) for every 
neighborhood $U$ of $x^*$, there is a neighborhood $U_1$ of $x^*$ in $U$ such that every solution 
$x(t)$, with $x(0)$ in $U_1$, is defined and in $U$ for all $t > 0$; and (ii) $\lim_{t \to \infty} x(t) = x^*$. 

Lemma 1 Let $x^*$ be an equilibrium for (5). Let $L : U \to \mathbb{R}$ be a continuous function 
defined on a neighborhood $U$ of $x^*$, differentiable on $U - x^*$, such that 

$L(x^*) = 0$ and $L(x) > 0$ if $x \ne x^*$; 

$\dot{L}(x) < 0$ in $U - x^*$; 

then $x^*$ is asymptotically stable. 

Proof. See Hirsch and Smale (1974). ■ 



4. Convergence Results 

In this section we study some special classes of games for which the CFP can be shown 
to converge. 

4.1 Zero-Sum Games 

Definition 6 The game $G = (A, B)$ is a zero-sum game if $B = -A$. 

The convergence of the DFP in zero-sum games was established by Robinson (1951). 
Robinson’s proof is rather deep and difficult. However, a rather simple argument shows 
the convergence of the CFP in zero-sum games. ^ 

Theorem 1 Suppose G is a zero-sum game. Then every CFP converges. 

Proof. We show that every BRD process (defined in (4)) converges. This will then 
imply that every CFP converges (see section 2.2). 

For $(x, y) \in \Delta(I) \times \Delta(J)$ define 

$$L(x, y) = \max_{x} xAy - \min_{y} xAy = B(y)Ay - xAB(x).$$

Observe that for all $(x, y)$, $L(x, y) \ge 0$. Furthermore, $L(x, y) = 0$ if and only if $(x, y)$ 
is an equilibrium of the game. We will argue that $L$ is a Lyapunov function for the BRD 
process. 

Observe that by the Envelope Theorem, we have $\frac{d}{dt}\left(\max_x xAy\right) = B(y)A\dot{y}$. 
Differentiating with respect to $t$ we obtain: 

$$\begin{aligned}
\dot{L} &= \frac{d}{dt}\left(\max_x xAy\right) - \frac{d}{dt}\left(\min_y xAy\right) \\
&= B(y)A\dot{y} - \dot{x}AB(x) \\
&= B(y)A\,[B(x) - y] - [B(y) - x]\,AB(x) \\
&= -B(y)Ay + xAB(x) \\
&= -L(x, y) \\
&< 0
\end{aligned}$$



as long as $(x, y)$ is not an equilibrium. Thus BRD is asymptotically stable by Lemma 
1. This completes the proof.^ ■ 

®The simple proof given below was first shown to us by Christopher Harris. Brown (1951) seems to be aware 
of this proof. See Hofbauer (1994) also. 

Figure 1: An example of non-convergence 

Corollary 1 The rate of convergence of CFP in zero-sum games is $O(1/t)$. 
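The identity $\dot{L} = -L$ implies $L(s) = L(0)e^{-s}$ along a BRD path, which becomes $O(1/t)$ after the time change $t = e^s$. The sketch below is an illustration only: it integrates the BRD for matching pennies with a plain Euler scheme (the game, the step size `h` and the initial point are choices made here, not taken from the text) and records the gap $L$. Because the best response jumps, the Euler path only tracks the exponential decay up to a discretization error of order $\sqrt{h}$.

```python
import math
import numpy as np

A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies for player 1; B = -A

def unit(n, k):
    e = np.zeros(n)
    e[k] = 1.0
    return e

def gap(x, y):
    """L(x, y) = max over player 1's strategies of xAy minus min over player 2's strategies."""
    return (A @ y).max() - (x @ A).min()

def brd_step(x, y, h):
    bx = unit(2, int(np.argmax(A @ y)))    # player 1 maximizes xAy
    by = unit(2, int(np.argmin(x @ A)))    # player 2 minimizes xAy because B = -A
    return x + h * (bx - x), y + h * (by - y)

def simulate(x, y, s_end=5.0, h=1e-3):
    gaps = [gap(x, y)]
    for _ in range(int(round(s_end / h))):
        x, y = brd_step(x, y, h)
        gaps.append(gap(x, y))
    return x, y, gaps

x_end, y_end, gaps = simulate(np.array([1.0, 0.0]), np.array([1.0, 0.0]))
predicted_at_s1 = gaps[0] * math.exp(-1.0)   # L(0) e^{-s} at s = 1
```

Comparing `gaps[1000]` (the gap at $s = 1$) with `predicted_at_s1` shows how closely the simulated path follows the exponential decay.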

4.2 An Example of Non-convergence 

Shapley (1964) showed that the fictitious play process need not converge in general 
non-zero-sum games. We present a slight simplification of Shapley’s example. 
Consider the following game (also known as “Rock-Paper-Scissors”). 

          R         P         S 

R       0, 0     -1, a     a, -1 

P      a, -1      0, 0     -1, a 

S      -1, a     a, -1      0, 0 

where $a > 0$. This game has a unique equilibrium at $p^* = q^* = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$. 

Suppose O' = 1. The best response correspondence for this game is depicted in the 
figure above. For all $p$ in the region marked P (resp. S, R), it is a best response to play 
P (resp. S, R). Suppose that both players start with the same initial beliefs about each 
other, that is, $p(0) = q(0)$. Then, by the symmetry of the game, we will have that for all 
$t$, $p(t) = q(t)$. Let x = , y = and z = . If $p(0) = x$, then 
as drawn, the CFP trajectory will cycle in the following manner: $x \to y \to z \to x \to \cdots$ and 
will not converge to the unique equilibrium.* 

^Clearly we are glossing over some details since there are points where the use of the envelope theorem is 
invalid and L is not differentiable. However, these complications can be taken care of without too much 
difficulty. See Hofbauer (1994) or Harris (1995). 

We now turn to two special classes of non-zero sum games for which the fictitious 
play process can be shown to converge. 

4.3 Potential Games 



Definition 7 A game $G = (A, B)$ is a game with identical payoffs if $B = A$. 

Monderer and Shapley (1996a) and (1996b) have introduced the following class of 
games. 



Definition 8 A game $G = (A, B)$ is a potential game if there exists a matrix 
$P = (p_{ij})$ such that for all $i, i', j, j'$: 

$$a_{i'j} - a_{ij} = p_{i'j} - p_{ij}$$

$$b_{ij'} - b_{ij} = p_{ij'} - p_{ij}.$$

$G$ is a weighted potential game if there exist $\alpha > 0$, $\beta > 0$ and a matrix $P = (p_{ij})$ 
such that for all $i, i', j, j'$: 

$$a_{i'j} - a_{ij} = \alpha\,(p_{i'j} - p_{ij})$$

$$b_{ij'} - b_{ij} = \beta\,(p_{ij'} - p_{ij}).$$



$P$ is called the potential (resp. weighted potential) for the game $G$. By Proposition 
2, every weighted potential game $G$ is BR-equivalent to a game with identical payoffs 
$G' = (P, P)$. 



Theorem 2 Suppose G is a game with identical payoffs. Then every CFP converges. 

Proof. Once again, we argue that the BRD process (defined in (4)) converges. This will 
then imply that every CFP converges (see section 2.2). 

For $(x, y) \in \Delta(I) \times \Delta(J)$ define 

$$M(x, y) = xAy.$$

Differentiating with respect to $t$ we obtain: 

$$\begin{aligned}
\dot{M} &= xA\dot{y} + \dot{x}Ay \\
&= xA\,[B(x) - y] + [B(y) - x]\,Ay \\
&\ge 0
\end{aligned}$$

and $\dot{M}$ is strictly positive unless $(x, y)$ is an equilibrium. 

Define $M^* = \lim_{t \to \infty} M(x, y)$ starting from some initial condition, say 
$(x(0), y(0))$. If $L(x, y) = M^* - M(x, y)$, then $L$ is a Lyapunov function. 

This completes the proof. ■ 

® Foster and Young (1995) have recently constructed an ingenious example of a game of coordination in which 
the FP process also cycles but fails to converge. 
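The monotonicity of $M(x, y) = xAy$ along a BRD path can be observed in a small simulation. The following sketch uses an Euler discretization and a 2 x 2 coordination game; the payoff matrix, the step size and the starting point are illustrative assumptions and do not come from the text.

```python
import numpy as np

A = np.array([[2.0, 0.0], [0.0, 1.0]])   # common payoff matrix, so B = A

def unit(n, k):
    e = np.zeros(n)
    e[k] = 1.0
    return e

def brd_step(x, y, h):
    bx = unit(2, int(np.argmax(A @ y)))   # player 1's pure best response to y
    by = unit(2, int(np.argmax(x @ A)))   # player 2's pure best response to x
    return x + h * (bx - x), y + h * (by - y)

def simulate(x, y, s_end=10.0, h=1e-3):
    values = [float(x @ A @ y)]           # M(x, y) along the path
    for _ in range(int(round(s_end / h))):
        x, y = brd_step(x, y, h)
        values.append(float(x @ A @ y))
    return x, y, values

x_end, y_end, values = simulate(np.array([0.3, 0.7]), np.array([0.6, 0.4]))
is_monotone = all(b >= a - 1e-9 for a, b in zip(values, values[1:]))
```

From this starting point the common payoff rises from 0.64 toward 2, the payoff at the pure equilibrium in which both players use their first strategy.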



From Theorem 2 and Proposition 2 we immediately obtain: 

Theorem 3 Suppose G is a weighted potential game. Then every CFP converges. 

4.4 2x2 Games 

Recall, from Proposition 3, that every 2 x 2 game is BR-equivalent to either a zero- 
sum game or a game with identical payoffs. Now from Theorems 1 and 2 we obtain: 

Theorem 4 Suppose $G$ is a 2 x 2 game. Then every CFP converges. 



The convergence of the DFP in 2 x 2 games was established by Miyasawa (1961). 
Recently, Metrick and Polak (1994) have provided a geometric proof of Miyasawa’s 
result. 



4.5 Games with Strategic Complementarities 

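Definition 9 The game $G = (A, B)$ satisfies strategic complementarities (SC) if, 
for all $i < i'$ and $j < j'$: 

$$(a_{i'j'} - a_{ij'}) \ge (a_{i'j} - a_{ij}) \quad \text{and} \quad (b_{i'j'} - b_{i'j}) \ge (b_{ij'} - b_{ij}).$$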

This class of games was introduced by Topkis (1979) and the properties of such 
games have been studied by Vives (1990) and Milgrom and Roberts (1990). In 
particular, such games always have a pure strategy equilibrium. Furthermore, if there 
is a unique equilibrium then the game is dominance solvable, that is, iterated removal 
of strongly dominated strategies will result in the equilibrium configuration. Thus if a 
game satisfying strategic complementarities has a unique equilibrium every CFP will 
converge. We now examine the behavior of CFP with multiple equilibria. 

Denote by $B(q)$ the largest element of $BR(q)$. The set $BR^{-1}(i) = 
\{q : i \in BR(q)\}$ is convex and notice that $B^{-1}(i) = \{q : i = B(q)\}$ is also convex. 
Let $\succeq$ denote the partial order on $\Delta(I)$ signifying first-order stochastic dominance, that 
is, $p'' \succeq p'$ if for all $i'$, 

$$\sum_{i \le i'} p''_i \le \sum_{i \le i'} p'_i.$$

The same symbol will denote the partial order on $\Delta(J)$. 

Lemma 2 Suppose that $G$ satisfies strategic complementarities. If $q'' \succeq q'$, then 
$B(q'') \ge B(q')$. 

Proof. Let $i' = B(q')$ and $i'' = B(q'')$. 

For all $i < i'$, $\sum_j (a_{i'j} - a_{ij})\, q'_j \ge 0$. 

Since for all $i < i'$, $(a_{i'j} - a_{ij})$ is a non-decreasing function of $j$, and $q'' \succeq q'$, for 
all $i < i'$, $\sum_j (a_{i'j} - a_{ij})\, q''_j \ge 0$. 

Thus, for all $i < i'$, $\sum_j a_{i'j}\, q''_j \ge \sum_j a_{ij}\, q''_j$. 

Hence, $B(q'') = i'' \ge i' = B(q')$. ■ 
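Lemma 2 can be spot-checked on an example. The sketch below uses the payoff matrix $a_{ij} = ij - 0.6\,i^2$, which is an assumption chosen here because its row differences are non-decreasing in $j$; it then draws beliefs $q'$, shifts probability mass toward a higher-numbered strategy to obtain a belief $q''$ that dominates $q'$, and compares the largest best responses. All names are hypothetical.

```python
import numpy as np

def has_increasing_differences(A):
    """True if a[i+1, j] - a[i, j] is non-decreasing in j for every i."""
    row_diffs = np.diff(A, axis=0)
    return bool(np.all(np.diff(row_diffs, axis=1) >= -1e-12))

def largest_best_response(A, q):
    """B(q): the largest index among player 1's pure best responses to q."""
    payoffs = A @ q
    return int(np.flatnonzero(payoffs >= payoffs.max() - 1e-12).max())

def shift_mass_upward(q, rng):
    """Return a belief that first-order stochastically dominates q."""
    q2 = q.copy()
    lo, hi = sorted(rng.choice(len(q), size=2, replace=False))
    moved = rng.uniform(0.0, q2[lo])
    q2[lo] -= moved                      # take mass from the lower index
    q2[hi] += moved                      # and give it to the higher index
    return q2

rng = np.random.default_rng(0)
n = 5
A = np.array([[i * j - 0.6 * i**2 for j in range(n)] for i in range(n)], dtype=float)

pairs = []
for _ in range(500):
    q_low = rng.dirichlet(np.ones(n))
    pairs.append((q_low, shift_mass_upward(q_low, rng)))

monotone = all(largest_best_response(A, q_high) >= largest_best_response(A, q_low)
               for q_low, q_high in pairs)
```

Moving mass from a lower to a higher index lowers every partial sum of the belief, which is exactly the dominance order defined above.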

Consider two initial conditions, $(p(1), q(1))$ and $(p'(1), q'(1))$ and the corresponding 
CFP trajectories $(p(t), q(t))$ and $(p'(t), q'(t))$. By Lemma 2 if $(p(1), q(1)) \succeq 
(p'(1), q'(1))$ then for all $t$, $(p(t), q(t)) \succeq (p'(t), q'(t))$. Thus in games with strategic 
complementarities, CFP is a monotone system as defined by Hirsch (1988). 

Definition 10 The game $G = (A, B)$ satisfies diminishing marginal returns (DMR) 
if for all $i, j$: 

$$(a_{i+1,j} - a_{ij}) < (a_{ij} - a_{i-1,j}) \quad \text{and} \quad (b_{i,j+1} - b_{ij}) < (b_{ij} - b_{i,j-1}).$$



The condition of diminishing marginal returns implies that for all $q$, $BR(q)$ can 
consist of at most two elements and that these two elements must be consecutively 
numbered strategies for player 1. That is, if $BR(q)$ is not a singleton then $BR(q) = 
\{i, i + 1\}$ for some $i \in I$. 

We now present a convergence result for games with strategic complementarities and 
diminishing returns due to Krishna (1992). 

Theorem 5 Suppose G satisfies strategic complementarities and diminishing 
marginal returns. Then every CFP converges. 

Proof. (Sketch) Let $Q$ be the set of limit points of $q(t)$. Let $B(Q) = \{i : i = B(q), 
q \in Q\}$ and $i_1 = \min B(Q)$. 

Case 1. For all $i < i_1$, $\limsup p_i(t) = 0$ (and hence $\lim p_i(t) = 0$). 

If $B(Q) = \{i_1\}$, then $\lim p_{i_1}(t) = 1$ and $p(t)$ converges. Hence $q(t)$ also converges. 
So suppose that there exists an $i \in B(Q)$ such that $i > i_1$. Then it is the case that 
the trajectory $q(t)$ traverses the boundary between the sets $B^{-1}(i_1 + 1)$ and $B^{-1}(i_1)$ 
infinitely often. Consider a time $T$ such that the trajectory $q(t)$ makes the transition 
from $B^{-1}(i_1 + 1)$ to $B^{-1}(i_1)$. 

Figure 2: Games with SC and DMR 

Suppose, as a simplification, that $\min B(Q) = i_1 = 1$ (as depicted in the 
accompanying figure). For all $t$ in some interval $(T, T + \varepsilon)$, $q(t) \in B^{-1}(1)$ and thus 
$i(t) = 1$. Thus, for all $t \in (T, T + \varepsilon)$, $p(t)$ is a declining sequence, that is, for all 
$t', t'' \in (T, T + \varepsilon)$, $t' < t''$, we have that $p(T) \succeq p(t') \succeq p(t'')$. Lemma 2 implies that 
for all $t \in (T, T + \varepsilon)$, $j(t)$ is also a declining sequence, that is, for all $t', t'' \in (T, T + \varepsilon)$, 
$t' < t''$, we have that $j(T) \ge j(t') \ge j(t'')$. (In the figure, $j(T) = 2$.) 

It can be established that because of diminishing returns, for all $t \in (T, T + \varepsilon)$, 
$B(j(t)) \le B(q(t))$. Thus the trajectory $q(t)$ cannot leave $B^{-1}(1)$ after $T$, contradicting 
the assumption that there is an $i > 1$ such that $i \in B(Q)$. 

If $i_1 > 1$ then it is no longer possible to argue that $p(T) \succeq p(t') \succeq p(t'')$. However, 
an indirect argument shows that we still have $j(T) \ge j(t') \ge j(t'')$ (Krishna 1992). 

Case 2. $\limsup p_{i_1 - 1}(t) > 0$. 

Since $i_1 = \min B(Q)$, it is certainly the case that for all $i < i_1 - 1$, $\limsup p_i(t) = 0$. 
Since the transition from $B^{-1}(i_1)$ to $B^{-1}(i_1 - 1)$ occurs infinitely often, then using the 
same argument as in case 1 it can be argued that $B^{-1}(i_1 - 1)$ will absorb the trajectory 
$q(t)$. ■ 



We know of no examples with strategic complementarities for which CFP does not 
converge. Thus we make the following conjecture. 

Conjecture 1 Suppose $G$ satisfies strategic complementarities. Then every CFP 
converges. 



5. Mixed Strategies 

We have identified two classes of non-zero sum games for which CFP processes are 
convergent: potential games and games satisfying SC and DMR. Games in these classes 
share the important feature that CFP typically converges to a pure strategy equilibrium. 
Are there classes of (non-zero sum) games for which CFP can be shown to converge to 
mixed strategy equilibrium? 

A CFP is said to be cyclical if there is a finite sequence of $K$ pure strategy 
combinations, say $(i_1, j_1), (i_2, j_2), (i_3, j_3), \ldots, (i_K, j_K)$, such that the pure strategies 
played along the CFP follow this sequence over and over. We refer to this sequence as 
a $k$-cycle if each player uses exactly $k$ distinct pure strategies. A $k$-cycle is said to be 
robust if there is an open set of initial conditions $(p(1), q(1))$ such that the resulting 
CFP follows this cycle. 

Krishna and Sjöström (1995) have shown the following. 

Theorem 6 For almost all games, if a robust $k$-cycle converges then $k \le 2$. 

The result shows that cyclical convergence of CFP to mixed strategy equilibria is a 
rare event. (The 2x2 exception is a consequence of Proposition 3.) Whether there are 
non-cyclical CFPs remains an open question. 



6. Convergence of Payoffs 

The notion of convergence we have used throughout is that the beliefs generated by 
the CFP converge to equilibrium beliefs. Consider the following simple game of pure 
coordination. 

H T 

H 1,1 0,0 

T 0,0 1,1 

Let $p = p_H$ and $q = q_H$. This game has three equilibria: $(0, 0)$, $(1, 1)$ and $(\frac{1}{2}, \frac{1}{2})$. 
Consider the discrete time process DFP with the weights on $p(t)$ and $B(q(t))$ 
as defined in (1). If $(p(0), q(0)) = (1, 0)$ then $(p(t), q(t))$ 
converges to $(\frac{1}{2}, \frac{1}{2})$. But for all $t$, $(i(t), j(t))$ is either $(T, H)$ or $(H, T)$ and so the 
actual payoffs in each period are 0 to both players while the equilibrium payoffs are $\frac{1}{2}$. 
Now consider the CFP. If $(p(1), q(1)) = (1, 0)$ then for all $t \in [1, 2)$ we have that 
$(i(t), j(t)) = (T, H)$ and thus $(p(2), q(2)) = (\frac{1}{2}, \frac{1}{2})$. However, the CFP is not well 
defined at the point $(\frac{1}{2}, \frac{1}{2})$ although it is natural to postulate that the CFP is a limit of the 
DFP as $\delta$ approaches 0. Thus in both the DFP and the CFP, while the beliefs converge to 
the equilibrium beliefs, the average of the actual payoffs is not the same as the expected 
payoffs in equilibrium. Of course, this can only happen when the convergence is to a 
mixed strategy equilibrium. 
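The gap between beliefs and realized payoffs can be reproduced in a few lines. The sketch below is an illustration under stated assumptions: it uses the standard discrete update $p(t+1) = \frac{t}{t+1}p(t) + \frac{1}{t+1}B(q(t))$ (the weights in (1) are not reproduced in this section), and it starts from beliefs $(\frac{9}{10}, \frac{1}{5})$ at $t = 1$ rather than $(1, 0)$, so that no belief ever equals exactly $\frac{1}{2}$ and no tie-breaking rule is needed. Exact rational arithmetic avoids rounding near $\frac{1}{2}$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def simulate_dfp(p, q, periods):
    """p (resp. q) is the belief at t = 1 that player 1 (resp. player 2) plays H."""
    payoffs = []
    for t in range(1, periods + 1):
        i_plays_h = q > HALF               # player 1 matches what he expects player 2 to do
        j_plays_h = p > HALF               # player 2 matches what she expects player 1 to do
        payoffs.append(1 if i_plays_h == j_plays_h else 0)
        p = (t * p + (1 if i_plays_h else 0)) / (t + 1)
        q = (t * q + (1 if j_plays_h else 0)) / (t + 1)
    return p, q, payoffs

p_end, q_end, payoffs = simulate_dfp(Fraction(9, 10), Fraction(1, 5), 1000)
average_payoff = Fraction(sum(payoffs), len(payoffs))
```

In this run the players miscoordinate in every period, so the average realized payoff is 0, while both beliefs approach $\frac{1}{2}$, where the equilibrium payoff is $\frac{1}{2}$.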

Consider the following result due to Monderer, Samet and Sela (1994). 



Theorem 7 Suppose that the CFP $(p, q) \to (p^*, q^*)$ and for all $t$, $(p, q) \ne (p^*, q^*)$. 
Then the limit of the average of the accumulated payoffs from the path $(i(t), j(t))$ is the 
same as the expected payoff in equilibrium. 



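Proof. Let $\alpha(t) = \max_p pAq(t) = e_{i(t)}Aq(t)$ denote player 1’s expected payoff in 
period $t$, where $e_{i(t)}$ is the $i(t)$th unit vector. By definition, the actual payoff obtained 
in period $t$ is $a_{i(t)j(t)} = e_{i(t)}Ae_{j(t)}$. Then, 

$$\begin{aligned}
\dot{\alpha} &= e_iA\dot{q} \\
&= \tfrac{1}{t}\, e_iA\,[B(p) - q] \\
&= \tfrac{1}{t}\, e_iA\,[e_j - q] \\
&= \tfrac{1}{t}\, [e_iAe_j - e_iAq]
\end{aligned}$$

and thus: 

$$t\dot{\alpha} + \alpha = e_iAe_j = a_{ij}$$

which implies that: 

$$\frac{d}{dt}\,[t\alpha] = e_iAe_j = a_{ij}.$$

Thus, we have that: 

$$t\alpha(t) = c + \int_1^t a_{i(s)j(s)}\, ds$$

or that: 

$$\alpha(t) = \frac{c}{t} + \frac{1}{t}\int_1^t a_{i(s)j(s)}\, ds.$$

Taking limits we obtain that: 

$$\begin{aligned}
p^*Aq^* &= \max_p pAq^* \\
&= \lim_{t \to \infty} \left(\max_p pAq(t)\right) \\
&= \lim_{t \to \infty} \alpha(t) \\
&= \lim_{t \to \infty} \frac{1}{t}\int_1^t a_{i(s)j(s)}\, ds
\end{aligned}$$

which completes the proof. 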



It can be shown that for zero-sum games the actual payoff always converges to the 
expected payoff. This is true even for the DFP (Riviere (1993), Monderer, Samet and 
Sela (1994)). 

7. Continuous Versus Discrete Time 

It should be apparent that there is much to be gained from considering the continuous 
time version of fictitious play (CFP). It leads to the development of strong convergence 
results whose proofs are simpler and more direct than those of their counterparts for the 
discrete process (DFP).^ 

We propose the following. 

Conjecture 2 Suppose that, for some game G, every CFP is convergent. Then every 
DFP is convergent. 



8. Conclusion 

While salient in many ways, fictitious play is hardly the only interesting learning 
or evolutionary mechanism. A small and very incomplete sample of other proposed 
schemes that have been studied may be found in Fudenberg and Kreps (1988), Kandori, 
Mailath and Rob (1993) and Young (1993). 

Throughout this survey the questions have been tightly focussed around the fictitious 
play process and around special classes of games. We believe that this focus affords two 
advantages. First, it allows many questions to be posed in a precise manner. Second, 
fictitious play seems to capture many important features associated with learning. It 
is hoped that the results will be generalizable to other classes of games and to other 
learning schemes. Thus, the approach espoused in this survey has been to concentrate 
on specific classes of games with a view to generating strong results. 

Expressing similar sentiments, a mathematician (John Casti quoted by M. Hirsch) 
has observed that: 

“All current indications point toward the conclusion that seeking a com- 
pletely general theory of nonlinear systems is somewhat akin to the search for 
the Holy Grail: a relatively harmless activity full of many pleasant surprises and 
mild disappointments, but ultimately unrewarding. A far more profitable path 
to follow is to concentrate upon special classes of nonlinear problems, usually 
motivated by applications, and to use the structure inherent in these classes as a 
guide to useful (i.e., applicable) results.” 

® Andreu Mas-Colell first suggested to us that the continuous time system may be simpler to work with than 
the discrete time system. 

Abstract. We present a selective survey of work on evolutionary models of 
dynamics in games. We focus on the continuous time replicator dynamics and report 
convergence results for specific classes of games. 



Keywords. Equilibrium, evolution, replicator dynamics, dynamical systems. 



1. Introduction 

This paper is a selective survey of work on a dynamic evolutionary model for games: 
the replicator dynamics introduced by Taylor and Jonker (1978). It is intended to be 
the second of a two-part survey on dynamical systems associated with games. Our 
main focus is on convergence results for specific classes of games. A companion piece 
surveys work on the Brown-Robinson learning process known as fictitious play, and in 
this paper our purpose is to compare the results obtained for fictitious play with those 
for the replicator dynamics. 

Detailed accounts of evolutionary dynamical systems, including replicator 
dynamics, may be found in Hofbauer and Sigmund (1988) and Weibull (1995). 



2. Preliminaries 

Suppose that there is a single, infinite population of animals with types $i = 1, 2, \ldots, I$. 
An individual animal’s type $i$ is simply a pure strategy which is genetically encoded 
and that the individual always plays. A single play of the game is viewed as a pairwise 
encounter between, say, animals with types $i$ and $j$. The expected number of offspring 
of the animal of type $i$ is then $a_{ij}$; and the expected number of offspring of the animal 
of type $j$ is $a_{ji}$. Let $A = (a_{ij})$ be a fitness matrix of size $I \times I$. In game theoretic terms, 
there is an underlying symmetric game $G = (A, B)$, where $A$ is a square matrix and 
$B = A^T$. Animals are matched randomly in each encounter and thus if in the current 
population the proportion of types is given by the vector $p = (p_1, p_2, \ldots, p_I) \in \Delta(I)$, 
the overall expected fitness of type $i$ is $e_iAp$ where $e_i$ is the $i$th unit vector. Behavior 
which increases the animal’s fitness will be selected and behavior is assumed to be 
inherited via asexual reproduction. Thus if type $i$ does well relative to other types, in 
the next generation the proportion of type $i$ animals will be larger. 



2.1 Replicator Dynamics 



We consider the following dynamic process: Let $N(t)$ denote the total number of 
animals in the population at time $t$. Let $N_i(t)$ denote the number of animals of type 
$i$ in the population at time $t$. If $\mu$ represents the per period death rate, then 

$$N_i(t + 1) = N_i(t)\,[1 - \mu + e_iAp(t)]$$

where $e_iAp(t)$ is the expected number of offspring with type $i$, and $p(t)$ is an $I \times 1$ vector 
with typical element $p_i(t) = N_i(t)/N(t)$. Thus, 

$$N(t + 1) = N(t)\,[1 - \mu + p(t)Ap(t)].$$

We may then write: 

$$p_i(t + 1) = \frac{N_i(t + 1)}{N(t + 1)} = p_i(t)\,\frac{1 - \mu + e_iAp(t)}{1 - \mu + p(t)Ap(t)}$$

which yields: 

$$p_i(t + 1) - p_i(t) = p_i(t)\,\frac{e_iAp(t) - p(t)Ap(t)}{1 - \mu + p(t)Ap(t)}.$$

If we let the distance between time periods be denoted by $\delta$ and divide by $\delta$ we obtain: 

$$\frac{p_i(t + \delta) - p_i(t)}{\delta} = p_i(t)\,\frac{e_iAp(t) - p(t)Ap(t)}{1 - \delta\mu + p(t)\,\delta A\,p(t)}.$$

Taking the limit as $\delta \to 0$ yields: 

$$\dot{p}_i(t) = p_i(t)\,[e_iAp(t) - p(t)Ap(t)] \qquad (1)$$

(1) may be written more compactly as: 

$$\dot{p}_i = p_i\,[e_iAp - pAp]$$



Given an initial condition $p(0) \gg 0$, (1) generates continuous time replicator dynamics 
(CRD) for the population. Observe that we may decompose (1) into $e_iAp(t)$, which 
represents the fitness of type $i$, and $p(t)Ap(t)$, which represents the average fitness of 
the population. Thus, in CRD, all strategies continue to exist forever. This differs from 
the continuous time fictitious play (CFP) dynamics where only strategies that are best 
responses are present as $t \to \infty$. 

The following property of (1) is called the quotient rule: 

$$\frac{d}{dt}\left(\frac{p_i}{p_j}\right) = \frac{p_i}{p_j}\,[e_iAp - e_jAp]$$

which says that strategy $i$ grows faster than $j$ if and only if it has a higher expected 
payoff. 

It is easy to argue that, like the CFP, if CRD converges then its limit must be an 
equilibrium. 

Theorem 1 If $p(0) \gg 0$ and $p(t) \to p^*$ then $p^*$ is an equilibrium. 



Proof. Suppose $p^*$ is not an equilibrium. Then there exists a pure strategy $i$ such that 
$e_iAp^* - p^*Ap^* = 2\delta > 0$. By continuity, there is a neighborhood $P$ of $p^*$ such that 
$e_iAq - qAq > \delta$ for all $q \in P$. Since $p(t)$ converges to $p^*$, $p(t) \in P$ for all $t > t_0$. 
Furthermore, since $p_i(0) > 0$, for all $t$, $p_i(t) > 0$. In fact, 

$$\frac{\dot{p}_i(t)}{p_i(t)} = e_iAp(t) - p(t)Ap(t) > \delta, \quad \text{for all } t > t_0.$$

Therefore, $p_i(t) > p_i(t_0)\,e^{\delta (t - t_0)} \to \infty$, contradicting $p(t) \in P$. ■ 



3. Evolutionarily Stable Strategies 

Maynard Smith (1982) introduced the following definition. 

Definition 1 A mixed strategy (set of types) $p^*$ is an Evolutionarily Stable Strategy 
(ESS) if for any $p \ne p^*$, there exists $\varepsilon' > 0$ such that for all $\varepsilon \in (0, \varepsilon')$: 

$$p^*A\,((1 - \varepsilon)p^* + \varepsilon p) > pA\,((1 - \varepsilon)p^* + \varepsilon p).$$

The notion of an ESS is intended to capture the idea that the strategy $p^*$ is immune 
to small invasion by another strategy $p$. It is easy to verify that $p^*$ is an ESS if and only 
if for all $p \ne p^*$: 

(i) $p^*Ap^* \ge pAp^*$ (2) 

(ii) if $p^*Ap^* = pAp^*$ then $p^*Ap > pAp$. 

Thus for symmetric games, an ESS is a refinement of the set of Nash equilibria. In 
particular every ESS is an undominated equilibrium. Not every game has an ESS. 

Theorem 2 Suppose $p^* \gg 0$ is an ESS. Then every CRD converges to $p^*$. 

Proof. First, observe that since $p^* \gg 0$, by the definition of an equilibrium we have 
that for all $p$, $pAp^* = p^*Ap^*$ and hence by (2), for all $p \ne p^*$, 

$$p^*Ap > pAp \qquad (3)$$

Next, define the function $Z : \Delta(I) \to \mathbb{R}$ by: 

$$Z(p) = \sum_i p^*_i \ln p_i.$$

For all $p \ne p^*$, we have: 

$$\begin{aligned}
Z(p) - Z(p^*) &= \sum_i p^*_i \ln \frac{p_i}{p^*_i} \\
&< \ln \sum_i p^*_i\, \frac{p_i}{p^*_i} \\
&= \ln \sum_i p_i \\
&= 0
\end{aligned}$$

where we have used Jensen’s inequality. Thus $Z$ has a strict maximum at $p^*$. 
Now observe that: 

$$\begin{aligned}
\dot{Z}(p) &= \sum_i p^*_i\, \frac{\dot{p}_i}{p_i} \\
&= \sum_i p^*_i\,[e_iAp - pAp] \\
&= p^*Ap - pAp \\
&> 0
\end{aligned}$$

as long as $p \ne p^*$, by using (3). Let $L(p) = Z(p^*) - Z(p)$, then $L$ is a Lyapunov 
function; hence $p^*$ is an asymptotically stable equilibrium. ■ 



3.1 Equivalent Games 

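Suppose that $A$ and $A'$ are two fitness matrices such that for all $i, i', j$: 

$$a'_{ij} - a'_{i'j} = a_{ij} - a_{i'j}.$$

Then the replicator dynamics for $A'$ is the same as the replicator dynamics for $A$. To see 
this, write $a'_{ij} = a_{ij} + c_j$. The replicator dynamics for the fitness matrix $A'$ is: 

$$\begin{aligned}
\dot{p}_i &= p_i\,[e_iA'p - pA'p] \\
&= p_i \Big[\sum_j a'_{ij}p_j - \sum_k \sum_j a'_{kj}p_kp_j\Big] \\
&= p_i \Big[\sum_j (a_{ij} + c_j)p_j - \sum_k \sum_j (a_{kj} + c_j)p_kp_j\Big] \\
&= p_i \Big[\sum_j a_{ij}p_j - \sum_k \sum_j a_{kj}p_kp_j + \sum_j c_jp_j - \sum_k p_k \sum_j c_jp_j\Big] \\
&= p_i \Big[\sum_j a_{ij}p_j - \sum_k \sum_j a_{kj}p_kp_j\Big] \\
&= p_i\,[e_iAp - pAp]
\end{aligned}$$

and this is the same as the replicator dynamics for the fitness matrix $A$. 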
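The invariance derived above is easy to confirm numerically. In the following sketch the fitness matrix, the column constants $c_j$ and the population state are random draws chosen for illustration, and `replicator_rhs` is a hypothetical helper that evaluates the right-hand side of (1).

```python
import numpy as np

def replicator_rhs(A, p):
    """Right-hand side of (1): p_i [e_i A p - p A p] for every type i."""
    fitness = A @ p                  # e_i A p for each i
    average = p @ fitness            # p A p
    return p * (fitness - average)

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
c = rng.normal(size=4)
A_shifted = A + c[np.newaxis, :]     # a'_ij = a_ij + c_j: a constant added to each column
p = rng.dirichlet(np.ones(4))        # an interior population state

rhs_original = replicator_rhs(A, p)
rhs_shifted = replicator_rhs(A_shifted, p)
```

The two vector fields agree up to rounding, and the components of each sum to zero, so the population shares stay on the simplex.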



4. Convergence Results 

In this section we examine the convergence properties of the replicator dynamics. It is 
useful to begin by considering an example. 

4.1 An Example 



Consider the following game, also known as “Rock-Scissors-Paper”: 

              Rock    Scissors    Paper 

Rock            1        a          0 

Scissors        0        1          a 

Paper           a        0          1 

where $a > 1$. This game has a unique equilibrium $p^* = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})$. According to 
Theorem 1, if the orbits converge, they converge to the unique symmetric equilibrium 
$p^*$. For any $p$, 

$$pAp = |p|^2 + a\,(p_1p_2 + p_2p_3 + p_1p_3) = |p|^2 + \frac{a}{2}\left(1 - |p|^2\right).$$



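The replicator dynamics are: 

$$\dot{p}_1 = p_1\,[p_1 + ap_2 - pAp]$$

$$\dot{p}_2 = p_2\,[p_2 + ap_3 - pAp]$$

$$\dot{p}_3 = p_3\,[ap_1 + p_3 - pAp]$$

Consider the function $V(p) = p_1p_2p_3$ defined on the simplex. $V = 0$ on 
the boundary of the simplex, and $V$ attains its maximum at $p^*$. We compute the time 
derivative of $\ln V$: 

$$\begin{aligned}
\frac{d}{dt} \ln V &= \frac{\dot{p}_1}{p_1} + \frac{\dot{p}_2}{p_2} + \frac{\dot{p}_3}{p_3} \\
&= [p_1 + ap_2 - pAp] + [p_2 + ap_3 - pAp] + [ap_1 + p_3 - pAp] \\
&= 1 + a - 3pAp \\
&= \left(\frac{a}{2} - 1\right)\left[3|p|^2 - 1\right]
\end{aligned}$$

Since $3|p|^2 > 1$ for $p \ne p^*$, we have: 

$$\dot{V} \;\begin{cases} > 0 & \text{if } a > 2 \\ = 0 & \text{if } a = 2 \\ < 0 & \text{if } a < 2 \end{cases}$$

Thus, we have that if $a < 2$, CRD does not converge. 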
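The closed form for $\frac{d}{dt}\ln V$ can be compared with a direct evaluation of the growth rates. The sketch below does this at one interior point for three values of $a$; the function names and the test point are illustrative choices, not part of the original text.

```python
import numpy as np

def rsp_matrix(a):
    """Fitness matrix of the Rock-Scissors-Paper example."""
    return np.array([[1.0, a, 0.0],
                     [0.0, 1.0, a],
                     [a, 0.0, 1.0]])

def dlogV_dt(a, p):
    """Sum of the growth rates p_i'/p_i = e_i A p - p A p over the three types."""
    fitness = rsp_matrix(a) @ p
    return float(np.sum(fitness - p @ fitness))

def closed_form(a, p):
    """(a/2 - 1)(3|p|^2 - 1), the expression derived in the text."""
    return (a / 2.0 - 1.0) * (3.0 * float(p @ p) - 1.0)

p = np.array([0.5, 0.3, 0.2])
direct = {a: dlogV_dt(a, p) for a in (1.5, 2.0, 3.0)}
formula = {a: closed_form(a, p) for a in (1.5, 2.0, 3.0)}
```

At this point $3|p|^2 - 1 = 0.14$, so the sign of the derivative is governed entirely by whether $a$ lies below, at, or above 2.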



4.2 Zero-Sum Games 



Definition 2 The game $G = (A, B)$ is a symmetric zero-sum game if $B = -A = A^T$. 

We observe that the CRD does not converge in general for symmetric zero-sum 
games. More formally: 

Theorem 3 Suppose that $G$ is a (symmetric) zero-sum game with a completely mixed 
equilibrium $p^*$. Then every CRD starting from $p(0) \gg 0$, $p(0) \ne p^*$ does not converge 
to $p^*$. 



Proof. Suppose $p^*$ is a completely mixed equilibrium. Since the matrix $A$ is skew- 
symmetric, that is, $A^T = -A$, its value is 0 and by definition, we have that for all 
$i$: 

$$0 = p^*Ap^* = e_iAp^* = p^*A^Te_i = -p^*Ae_i.$$

Hence we have that for all $i$, $p^*Ae_i = 0$ and thus for all $p$: 

$$p^*Ap = 0$$

We also have that $pAp = (pAp)^T = pA^Tp = -pAp$ and thus for all $p$: 

$$pAp = 0$$



Now define the function: 

$$V(p) = \sum_i p^*_i \ln p_i$$



Suppose $p \gg 0$. Then 

$$\begin{aligned}
\dot{V}(p) &= \sum_i p^*_i\, \frac{\dot{p}_i}{p_i} \\
&= \sum_i p^*_i\,[e_iAp - pAp] \\
&= p^*Ap - pAp \\
&= 0
\end{aligned}$$

Therefore the orbits of CRD lie on the level curves of $V$ and hence do not converge 
to $p^*$. ■ 
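The conservation of $V$ along orbits can be observed numerically. The sketch below integrates the replicator dynamics for the standard zero-sum Rock-Scissors-Paper matrix with a classical fourth-order Runge-Kutta step; the matrix, the step size and the initial state are assumptions made for illustration, and the integrator only conserves $V$ up to its discretization error.

```python
import numpy as np

A = np.array([[0.0, 1.0, -1.0],
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])          # skew-symmetric: A^T = -A
p_star = np.full(3, 1.0 / 3.0)            # completely mixed equilibrium of this game

def replicator_rhs(p):
    fitness = A @ p
    return p * (fitness - p @ fitness)

def rk4_step(p, h):
    k1 = replicator_rhs(p)
    k2 = replicator_rhs(p + 0.5 * h * k1)
    k3 = replicator_rhs(p + 0.5 * h * k2)
    k4 = replicator_rhs(p + h * k3)
    return p + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def V(p):
    """V(p) = sum_i p*_i ln p_i."""
    return float(p_star @ np.log(p))

p = np.array([0.5, 0.3, 0.2])
v_start = V(p)
for _ in range(2000):                     # integrate up to time 20 with step 0.01
    p = rk4_step(p, 0.01)
v_end = V(p)
distance_from_equilibrium = float(np.linalg.norm(p - p_star))
```

The value of $V$ at the end of the run matches its initial value to high accuracy, and the state remains at a distance from $p^*$ that does not shrink, in line with Theorem 3.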

4.3 Potential Games 

Ronald Fisher first considered games in which the two players have identical payoffs, 
that is $A^T = A$, and such games play an important role in evolutionary biology. It 
is useful to think of a game as being played by bits of DNA, called alleles, which are 
competing for a place on some gene locus. Alleles may be thought of as “candidates 
for genes.” Each allele is identified as a strategy and an animal consists of a pair $(i, j)$ 
of alleles, which is known as its genotype. Since the two alleles belong to one animal, 
the phenotype, their survival and reproduction must be the same. Thus $a_{ij}$ represents 
the joint fitness of the pair $(i, j)$ and we have that $a_{ij} = a_{ji}$. 



Theorem 4 Suppose that $G$ is a (symmetric) game with identical payoffs. Then every 
CRD converges. 

Proof. Define $V(p) = pAp$. We have: 

$$\dot{V}(p) = \dot{p}Ap + pA\dot{p} = 2\dot{p}Ap$$

since $A$ is symmetric. 
Now: 

$$\begin{aligned}
\tfrac{1}{2}\dot{V}(p) &= \dot{p}Ap \\
&= \sum_i \dot{p}_i\,(e_iAp) \\
&= \sum_i p_i\,[e_iAp - pAp]\,(e_iAp) \\
&= \sum_i p_i\,(e_iAp)^2 - \Big(\sum_i p_i\,(e_iAp)\Big)^2 \\
&= \sum_i p_i\,[e_iAp - pAp]^2 \\
&\ge 0
\end{aligned}$$

and is strictly positive unless $p$ is an equilibrium. Define $V^* = \lim_{t \to \infty} V(p)$ starting 
from some initial condition $p(0)$. Let $L(p) = V^* - V(p)$, then $L$ is a Lyapunov function; 
hence $\lim_{t \to \infty} p(t)$ is an asymptotically stable equilibrium. ■ 
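The key identity in the proof, $\dot{V} = 2\sum_i p_i\,[e_iAp - pAp]^2$, says that average fitness grows at twice the variance of fitness in the population. The sketch below checks it at a random interior state for a randomly drawn symmetric matrix; both draws and the helper name are illustrative assumptions.

```python
import numpy as np

def replicator_rhs(A, p):
    fitness = A @ p
    return p * (fitness - p @ fitness)

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
A = 0.5 * (M + M.T)                           # symmetric fitness matrix: identical payoffs
p = rng.dirichlet(np.ones(4))                 # interior population state

fitness = A @ p
p_dot = replicator_rhs(A, p)
v_dot = 2.0 * float(p_dot @ A @ p)            # d/dt (pAp) = 2 p_dot A p when A is symmetric
fitness_variance = float(p @ ((fitness - p @ fitness) ** 2))
```

Here `v_dot` and `2 * fitness_variance` agree up to rounding, and both are non-negative.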



The following result is called the “Fundamental Theorem of Natural Selection” and 
is attributed to R. Fisher. 



Corollary 1 (The Fundamental Theorem of Natural Selection) Average fitness is 
increasing. 



4.3.1 Gradient Fields 

In this subsection we establish “Kimura’s Maximum Principle”: The change in gene 
frequencies occurs in such a way that the increase in average fitness is maximal. 

The average fitness in the gene pool is $pAp$ and the increase in average fitness is 
maximal if gene frequencies move in the direction of the gradient of $pAp$, that is, $2Ap$. 
Thus the dynamical system that maximizes the increase in average fitness is: 

$$\dot{p} = 2Ap \qquad (4)$$

But the CRD is not the same as (4). 

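A formal treatment of Kimura’s principle requires the introduction of a new metric 
on $\Delta(I)$ that is different from the Euclidean metric. 

For any $p$ in the interior of $\Delta(I)$ consider the inner product $\langle \cdot, \cdot \rangle_p$ on the tangent space at $p$ defined 
by*: 

$$\langle x, z \rangle_p = \sum_i \frac{1}{p_i}\, x_i z_i$$

Given a function $V(p)$, there exists a unique vector $\operatorname{grad} V(p)$ in the tangent space 
at $p$ called the gradient of $V$ at $p$ such that for all $z$ in the tangent space at $p$: 

$$\langle \operatorname{grad} V(p), z \rangle_p = \sum_i \frac{\partial V}{\partial p_i}\, z_i$$

Lemma 1 The gradient corresponding to the S-inner product is: 

$$(\operatorname{grad} V(p))_i = p_i \Big( \frac{\partial V}{\partial p_i} - \sum_j p_j \frac{\partial V}{\partial p_j} \Big)$$

Proof. Verify that $\sum_i (\operatorname{grad} V(p))_i = 0$. 

Now: 

$$\begin{aligned}
\langle \operatorname{grad} V(p), z \rangle_p &= \sum_i \frac{1}{p_i}\, (\operatorname{grad} V(p))_i\, z_i \\
&= \sum_i \Big( \frac{\partial V}{\partial p_i} - \sum_j p_j \frac{\partial V}{\partial p_j} \Big) z_i \\
&= \sum_i \frac{\partial V}{\partial p_i}\, z_i
\end{aligned}$$

where the last step uses $\sum_i z_i = 0$. ■ 

Definition 3 A dynamical system $\dot{p} = f(p)$ is said to be a gradient field with 
potential function $V(p)$ if $f(p) = \operatorname{grad} V(p)$. 

Since $\operatorname{grad} V(p)$ is the direction of steepest increase of $V$, Kimura’s maximum 
principle is a consequence of the following result. 

Theorem 5 The CRD is a S-gradient vector field with potential function $\frac{1}{2}pAp$. 

Proof. Consider $V(p) = \frac{1}{2}pAp$. Then $\frac{\partial V}{\partial p_i} = e_iAp$ and $\sum_j p_j \frac{\partial V}{\partial p_j} = pAp$. The 
replicator dynamics is: 

$$\dot{p}_i = p_i\,(e_iAp - pAp)$$

Thus from the previous lemma the CRD can be rewritten as: 

$$\dot{p} = \operatorname{grad} V(p). \quad ■$$

^The tangent space at $p$ is the set of vectors $v$ such that $\sum_i v_i = 0$. 

4.4 2x2 Games 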

For 2x2 games we prove the result that the orbits converge if they start in the interior. 



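Theorem 6 In non-degenerate 2x2 games, if $p(0) \gg 0$ then $p(t)$ converges to an 
equilibrium. 

Proof. Without loss of generality, assume $a_{12} = a_{21} = 0$, and $a_{11}a_{22} \ne 0$ (non- 
degeneracy). We need to consider only strategy one; thus, we have: 

$$\begin{aligned}
\dot{p}_1 &= p_1 \left( p_1a_{11} - \left( p_1^2a_{11} + (1 - p_1)^2a_{22} \right) \right) \\
&= p_1(1 - p_1) \left( p_1a_{11} + (1 - p_1)(-a_{22}) \right)
\end{aligned}$$

There are four cases to consider: 

Case 1. $a_{11} > 0 > a_{22}$. Then $\dot{p}_1(t) > 0$ for all $t$ and $\lim p_1(t) = 1$. 

Case 2. $a_{11} < 0 < a_{22}$. Then $\dot{p}_1(t) < 0$ for all $t$ and $\lim p_1(t) = 0$. 

Case 3. $a_{11} > 0$, $a_{22} > 0$. Then 

$$\dot{p}_1(t) \;\begin{cases} > 0 \text{ for all } t > 0 & \text{if } p_1(0) > \frac{a_{22}}{a_{11} + a_{22}} \\ < 0 \text{ for all } t > 0 & \text{if } p_1(0) < \frac{a_{22}}{a_{11} + a_{22}} \end{cases}$$

thus, $p_1(t) \to 1$ if $p_1(0) > \frac{a_{22}}{a_{11} + a_{22}}$ and $p_1(t) \to 0$ if $p_1(0) < \frac{a_{22}}{a_{11} + a_{22}}$. 

Case 4. $a_{11} < 0$, $a_{22} < 0$. Then 

$$\dot{p}_1(t) \;\begin{cases} > 0 \text{ for all } t > 0 & \text{if } p_1(0) < \frac{a_{22}}{a_{11} + a_{22}} \\ < 0 \text{ for all } t > 0 & \text{if } p_1(0) > \frac{a_{22}}{a_{11} + a_{22}} \end{cases}$$

thus, $p_1(t) \to \frac{a_{22}}{a_{11} + a_{22}}$. The limits are equilibria by Theorem 1. ■ 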



4.5 Games with Strategic Complementarities 

The behavior of CRD in symmetric games with strategic complementarities (see the 
previous chapter for a definition) remains an open problem. 



5. Convergence of the Time Average 

In many instances even if the trajectory p{t) does not converge, it is possible to show 
convergence of the “time average” of p{t). 



Definition 4 The time average of the trajectory $p(t)$ at time $T$ is 

$$m(T) = \frac{1}{T} \int_0^T p(t)\, dt$$

Definition 5 The strategy $i$ is permanent if there exists an $\varepsilon > 0$ such that for all 
$t$, $p_i(t) > \varepsilon$.^ 


^Our definition of permanent strategies is different from that in Hofbauer and Sigmund (1988). 

Theorem 7 If every strategy is permanent, then every limit point of the sequence of 
time averages of CRD, $m(T)$, is an equilibrium. 

Proof. Observe that 

$$\begin{aligned}
\frac{1}{T}\,[\ln p_i(T) - \ln p_i(0)] &= \frac{1}{T} \int_0^T d \ln p_i(t) \\
&= \frac{1}{T} \int_0^T (e_iAp - pAp)\, dt \quad \text{for all } i
\end{aligned}$$

thus, 

$$\begin{aligned}
\frac{1}{T}\,[\ln p_i(T) - \ln p_i(0)] &= \frac{1}{T}\, e_iA \int_0^T p\, dt - \frac{1}{T} \int_0^T pAp\, dt \\
&= e_iAm(T) - \frac{1}{T} \int_0^T pAp\, dt
\end{aligned}$$

Suppose that along a convergent subsequence $m(T^n)$ of $m(T)$, $m(T^n) \to \bar{p}$. Then, 
for all $i$: 

$$\begin{aligned}
e_iA\bar{p} &= \lim_{n \to \infty} e_iAm(T^n) \\
&= \lim_{n \to \infty} \left[ \frac{1}{T^n} \left( \ln p_i(T^n) - \ln p_i(0) \right) + \frac{1}{T^n} \int_0^{T^n} pAp\, dt \right] \\
&= \lim_{n \to \infty} \frac{1}{T^n} \int_0^{T^n} pAp\, dt
\end{aligned}$$

since, by assumption, $p_i(T^n)$ stays away from the boundary of the simplex. Observe 
that the final expression is independent of $i$; hence, $\bar{p}$ is an equilibrium. ■ 



Corollary 2 In zero-sum games where $p^* \gg 0$, $\lim_{T \to \infty} m(T) = p^*$.

Proof. By Theorem 3 the CRD trajectory lies on the level curves of the function

$$V(p) = \sum_i p_i^* \ln p_i.$$

Since no level curve of this function can intersect the boundary of the simplex, every pure
strategy is permanent. Now the result follows immediately from Theorem 7. ■
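Corollary 2 can be illustrated on a small example. The sketch below is our own and makes several assumptions that are not in the text: it uses Rock-Paper-Scissors as the zero-sum game (so $p^* = (1/3, 1/3, 1/3)$), a classical fourth-order Runge-Kutta integrator, and the trapezoidal rule for the time average. All function names are hypothetical.

```python
def replicator_rhs(p, A):
    """CRD vector field: dp_i/dt = p_i * (e_i A p - p A p)."""
    n = len(p)
    Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
    pAp = sum(p[i] * Ap[i] for i in range(n))
    return [p[i] * (Ap[i] - pAp) for i in range(n)]


def rk4_step(p, A, dt):
    """One classical Runge-Kutta step for the replicator equation."""
    def shifted(k, h):
        return [x + h * kx for x, kx in zip(p, k)]
    k1 = replicator_rhs(p, A)
    k2 = replicator_rhs(shifted(k1, 0.5 * dt), A)
    k3 = replicator_rhs(shifted(k2, 0.5 * dt), A)
    k4 = replicator_rhs(shifted(k3, dt), A)
    return [x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for x, a, b, c, d in zip(p, k1, k2, k3, k4)]


def time_average(p0, A, T=200.0, dt=0.01):
    """Return (p(T), m(T)), with m(T) computed by the trapezoidal rule."""
    p = list(p0)
    integral = [0.0] * len(p)
    for _ in range(int(round(T / dt))):
        p_next = rk4_step(p, A, dt)
        integral = [s + 0.5 * dt * (x + y) for s, x, y in zip(integral, p, p_next)]
        p = p_next
    return p, [s / T for s in integral]


# Rock-Paper-Scissors payoff matrix (antisymmetric, hence zero-sum).
RPS = [[0.0, -1.0, 1.0],
       [1.0, 0.0, -1.0],
       [-1.0, 1.0, 0.0]]

p_T, m_T = time_average([0.5, 0.3, 0.2], RPS)
print("p(T) =", p_T)   # still cycling, away from the centre
print("m(T) =", m_T)   # close to (1/3, 1/3, 1/3)
```

The trajectory itself keeps cycling on a level curve of $V$, while the time average settles near the interior equilibrium, in line with the "Zero-sum" row of the summary table in the conclusions.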







6. Conclusions 

We have compared the convergence properties of the continuous time fictitious play
process (CFP) with those of the continuous time replicator dynamics (CRD). Our focus
has been to identify specific classes of games in which convergence of the processes
may be guaranteed. The results may be summarized in the following table.



Type of game    CFP            CRD

Zero-sum        Convergence    Non-convergence; m(T) converges
2x2             Convergence    Convergence
Potential       Convergence    Convergence
SC + DMR        Convergence    Open problem



We end with an interesting conjecture due to Gaunersdorfer and Hofbauer (1994): 

Conjecture 1 CFP converges if and only if the time average of CRD, $m(T)$,
converges.
PART D



DESCRIPTIVE THEORY 





Descriptive Approaches to Cooperation 



Reinhard Selten¹

Wirtschaftstheorie I, University Bonn, Adenauerallee 24-42, D-53113 Bonn, 
Germany 



1. Introduction 

There are three types of decision and game theory: 
ideal-normative theory,
prescriptive theory, and 
descriptive theory. 

In ideal normative game theory one assumes fully rational players and often also 
common knowledge of full rationality. The point of interest of ideal-normative 
game theory is the strategic behavior under these conditions. The assumptions 
are not realistic, but nevertheless, ideal normative game theory is an important 
intellectual pursuit. The consequences of ideal normative rationality are of great 
philosophical significance. 

Prescriptive game theory asks what a player should do if the
participants in the game are not fully rational. For prescriptive theory one
cannot take a completely Bayesian view because Bayesian decision making needs
probabilities and utilities as inputs. If people are not able to come up with
consistent utilities and probabilities then Bayesian methods are not directly
applicable to their decision problems.

Descriptive game theory is not concerned with the question how players should 
act, but how they actually do act. This lecture will be concerned with descriptive 
game theory only. 

Experimental observations strongly suggest that human players in games usually

do not optimize,
do not have utility functions, and
do not form probability distributions.

Thus the question arises: what do players do instead? 



¹ This paper is based on a written account of lectures given at the NATO
Advanced Study Institute on Cooperation: Game Theoretic Approaches,
1994 in Stony Brook, prepared by Bettina Kuon.
Descriptive game theory is still in its infancy but nevertheless much more
could be said about it than is possible in this lecture. We shall restrict ourselves 
to topics directly related to cooperation and even within this restriction no claim 
of completeness is made. 



2. The equity principle 

The equity principle is a very simple principle which is of great practical
importance. Homans (1961) emphasized it in his writings, but it has probably
been spelled out more or less clearly in the literature much earlier. The equity
principle is applicable to situations where something has to be distributed.

A player $i$ receives a share $x_i$ which is determined according to the weight $w_i$
of player $i$. The shares are measured by a standard of distribution. A standard
of comparison determines the weights of the players. This terminology was
introduced in a paper by Selten (1988). The equity principle proposes a distribution
such that the shares are proportional to the weights. This means

$$\frac{x_1}{w_1} = \frac{x_2}{w_2} = \dots = \frac{x_n}{w_n}.$$

A context in which the equity principle is often applied is cartel formation. 
Consider the fictitious example of a cartel of three firms which wants to restrict 
output to 600. The standard of distribution is production quantity. The firms 
have to decide on a quota up to which they are permitted to produce. If the 
standard of comparison is the capacity of the firms then the quotas shown in 
table 2.1. emerge. 

Table 2.1. Cartel formation

Firm    Capacity    Quota

I       500         300
II      300         180
III     200         120



In actual practice most of the bargaining is about the standards of comparison.
However, there are some requirements these standards have to satisfy: the 
standards of distribution and comparison have to be observable and relevant. 
Observability means that everybody can clearly see the basis of computing 
shares and weights. Relevance means that the standards are not arbitrary but 
closely related to the nature of the problem. 
In another fictitious example for the equity principle we look at the construction
of a common river water purification plant by three towns. The total cost
of the plant is $40 million. Suppose that the towns select the number of inhabitants
as the standard of comparison. Table 2.2 shows the resulting costs for each
town.

Table 2.2. River water purification plant

Town    Inhabitants    Million $

A       90,000         18
B       70,000         14
C       40,000         8



In this example the standard of distribution is the cost to be carried. Instead of 
the number of inhabitants another standard of comparison could be used, e.g. 
water consumption in the last three years. Water consumption may be consid- 
ered to be more relevant than the number of inhabitants and may therefore have 
a better chance to be adopted in the bargaining process. 
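Both fictitious examples apply the same proportional rule. As a minimal sketch (the function name `equity_shares` and the dictionary layout are our own choices, not part of the lecture), the shares in tables 2.1 and 2.2 can be recomputed as follows:

```python
from fractions import Fraction


def equity_shares(total, weights):
    """Split `total` so that share / weight is the same for every player."""
    weight_sum = sum(weights.values())
    return {name: Fraction(total) * w / weight_sum for name, w in weights.items()}


# Table 2.1: standard of distribution = output quota, standard of comparison = capacity.
quotas = equity_shares(600, {"I": 500, "II": 300, "III": 200})

# Table 2.2: standard of distribution = cost in million $, standard of comparison = inhabitants.
costs = equity_shares(40, {"A": 90_000, "B": 70_000, "C": 40_000})

for name, share in {**quotas, **costs}.items():
    print(name, share)
```

Replacing the weights by another standard of comparison, for instance water consumption, changes only the second argument; the rule itself stays the same.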



3. Equity and coalitional bargaining in three-person characteristic function games

In a three-person characteristic function game, in this section often simply called
a game, three players, named 1, 2, and 3, can form coalitions. A coalition $C$ is
a non-empty subset of the set of all players. Each coalition $C$ has a value $v(C)$
which the members can distribute among themselves. The value of the grand
coalition including all three players will be denoted by $v(123) = g$. The coalition
of players 1 and 2 has the value $v(12) = a$, the coalition of players 1 and 3 has
the value $v(13) = b$, and the coalition of players 2 and 3 has the value $v(23) = c$.
Without loss of generality assume a numbering of the players such that $a \ge b \ge c$.
We shall restrict our attention to zero-normalized games, which means that a
player $i$ who is not a member of a coalition receives $v(i) = d_i = 0$ ($i = 1, 2, 3$). The
function $v$ which assigns values to coalitions is called the characteristic function
of the game.

A coalition with more than one member is called a genuine coalition. We also
shall look at games in which not all genuine coalitions are permissible. The set
of all permissible genuine coalitions is denoted by $Q$. A three-person game is
called superadditive if all genuine coalitions are permissible, which means
$Q = \{12, 13, 23, 123\}$, and if in addition to this we have

$$v(C \cup D) \ge v(C) + v(D) \quad \text{for } C \cap D = \emptyset,$$

where $C$ and $D$ may be any two non-intersecting coalitions, not necessarily
genuine ones.

A coalition structure is a partition of the set of players. In a three-person
game at most one genuine coalition can be formed. Therefore, a coalition
structure can be described by the genuine coalition which is formed or by the
absence of any genuine coalition. There are five coalition structures: the grand
coalition 123, the three two-person coalitions 12, 13, and 23, and the
"null-structure" that no coalition is formed, indicated by "-".

A payoff configuration is a coalition structure together with a payoff distribution
with the property that each coalition in the partition fully distributes its
value among its members in such a way that everybody receives at least his
one-person payoff $d_i$. This means the payoff configurations can have the following
form:

$(-; d_1, d_2, d_3)$ if no genuine coalition is formed,

$(C; x_1, x_2, x_3)$ with $C \in Q$ and $\sum_{i \in C} x_i = v(C)$

and $x_i = d_i$ for $i \notin C$ and $x_i \ge d_i$ for $i \in C$.

A grid game is a game which involves a smallest money unit $\gamma$. The values of
the characteristic function are integer multiples of $\gamma$. In grid games only payoff
configurations are permitted which specify payoffs which are integer multiples
of $\gamma$. This has the consequence that grid games have only a finite number of
configurations. Usually, games played in experiments are grid games. In the
following we shall consider only grid games, referred to as games for short.

An area theory predicts a subset of the set of all configurations for every 
game in a specified class of games. Normative game theory has produced quite 
a number of area theories. The core and the Aumann-Maschler bargaining set 
(Aumann and Maschler 1965) seem to be the most important ones. Point theo- 
ries like the Shapley value (Shapley 1953) or the nucleolus (Schmeidler 1969) 
are an alternative to area theories. However, in view of the great variance 
usually shown by the results of game experiments point theories do not seem to 
be appropriate for descriptive purposes. Ideally, one would want a theory which 
predicts the distribution of the results, but the test of such theories requires 
many more data than are available at the moment. 

3.1 The bargaining set 

The bargaining set is of special importance for this lecture since it has been proposed
not only as a normative but also as a descriptive theory (Maschler 1978).
In the following we shall not give a definition of the bargaining set. It will only
be explained which payoff configurations are predicted by the bargaining set for
the case of the superadditive zero-normalized three-person game. As we shall
see, certain numbers called quotas are of special significance for this theory. The
quotas of players 1, 2, and 3 are the numbers $q_1$, $q_2$, and $q_3$, resp., which solve
the following set of equations:

$$q_1 + q_2 = a, \qquad q_1 + q_3 = b, \qquad q_2 + q_3 = c.$$

This means:

$$q_1 = \frac{a + b - c}{2}, \qquad q_2 = \frac{a + c - b}{2}, \qquad q_3 = \frac{b + c - a}{2}.$$

A game is called a quota game if $q_i \ge 0$ for $i = 1, 2, 3$, which is equivalent to
$b + c \ge a$. For each of the five coalition structures table 3.1 shows which payoff
configurations are predicted by the bargaining set.

Table 3.1. The bargaining set for zero-normalized three-person games

Coalition    b + c ≥ a (quota games)        b + c < a
structure

-            (-; 0, 0, 0)                   (-; 0, 0, 0)
12           (12; q1, q2, 0)                (12; x1, x2, 0) with x1 ≥ b and x2 ≥ c
13           (13; q1, 0, q3)                (13; b, 0, 0)
23           (23; 0, q2, q3)                (23; 0, c, 0)

Coalition structure 123:

2g < a + b + c:                     (123; x1, x2, x3) with xi = qi - (q1 + q2 + q3 - g)/3
2g ≥ a + b + c (non-empty core):    (123; x1, x2, x3) in the core, that is, with
                                    x1 + x2 ≥ a and x1 + x3 ≥ b and x2 + x3 ≥ c



In table 3.1 a case distinction is made between quota games and other games.
In addition to this, one has to distinguish between games with empty and non-empty
core. The core is non-empty if and only if $2g \ge a + b + c$.

If the core is non-empty the bargaining set predicts the configurations in the
core for the coalition structure 123. If the core is empty then the game must be
a quota game and the bargaining set predicts equal distances of payoffs from the
quotas for the coalition structure 123. In quota games configurations for two-person
coalitions predicted by the bargaining set split the value according to the
quotas. If the game is not a quota game then the bargaining set predicts that in
coalition 12 player 1 receives at least $b$ and player 2 receives at least $c$; in
coalition 13 player 1 receives the whole coalition value $b$ and in coalition 23
player 2 receives the whole coalition value $c$. The bargaining set does not
exclude the possibility that no genuine coalition is formed. The only configuration
for this coalition structure is also predicted. In games without the grand
coalition, in which two-person coalitions are permissible but not the three-person
coalition, the predictions for the two-person coalitions and the null-structure are
the same.

Example 

Consider the following zero-normalized three-person game:
$g = 110$, $a = 100$, $b = 80$, and $c = 70$.

This game has the quotas
$q_1 = 55$, $q_2 = 45$, and $q_3 = 25$.

The bargaining set in this example is

(-; 0, 0, 0)

(12; 55, 45, 0), (13; 55, 0, 25), (23; 0, 45, 25)

(123; 50, 40, 20).
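The numbers of this example can be recomputed from the quota formulas and from the quota-game column of table 3.1. The following sketch is ours and covers only quota games with an empty core; the function names are hypothetical.

```python
from fractions import Fraction


def quotas(a, b, c):
    """Solve q1 + q2 = a, q1 + q3 = b, q2 + q3 = c."""
    half = Fraction(1, 2)
    return half * (a + b - c), half * (a + c - b), half * (b + c - a)


def quota_game_bargaining_set(g, a, b, c):
    """Predicted configurations for a quota game (b + c >= a) with empty core
    (2g < a + b + c), keyed by coalition structure."""
    if b + c < a or 2 * g >= a + b + c:
        raise ValueError("sketch covers only quota games with an empty core")
    q1, q2, q3 = quotas(a, b, c)
    shift = (q1 + q2 + q3 - g) / 3          # equal distance from the quotas
    return {
        "-": (0, 0, 0),
        "12": (q1, q2, 0),
        "13": (q1, 0, q3),
        "23": (0, q2, q3),
        "123": (q1 - shift, q2 - shift, q3 - shift),
    }


example = quota_game_bargaining_set(g=110, a=100, b=80, c=70)
for structure, payoffs in example.items():
    print(structure, [str(x) for x in payoffs])
```

Here $2g = 220 < 250 = a + b + c$, so the core is empty and the grand-coalition payoffs lie $(125 - 110)/3 = 5$ below the quotas.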



3.1.1 The united bargaining set 

Maschler (1978) proposed to consider certain transformations of the game
instead of the original one: the power transformations. These transformations
can be interpreted as an application of equity considerations.

$$v_1(C) = v(C) + \frac{1}{2}\left[g - v(C) - v(N \setminus C)\right]$$

$$v_2(C) = v(C) + \frac{|C|}{3}\left[g - v(C) - v(N \setminus C)\right]$$

$$v_3(ij) = \max\left[v_1(ij), \tfrac{2}{3}\, g\right]$$

$$v_3(i) = g - v_3(jk), \qquad i, j, k \in \{1, 2, 3\},\ i \neq j \neq k \neq i.$$

Here, $|C|$ is the number of members of $C$.

$v_1(C)$ can be interpreted as the equal split of the surplus transformation. The
surplus of the value $g$ of the grand coalition over the sum of the values of $C$ and
its complement $N \setminus C$ is split evenly between $C$ and its complement. The transformation
$v_2(C)$ splits the same surplus proportionally to the numbers $|C|$ and
$|N \setminus C|$ of members of both coalitions. We may say that $v_1$ and $v_2$ both use the
same surplus as the standard of distribution but different standards of comparison.
In the case of $v_1$ both coalitions receive the same weight, whereas in the
case of $v_2$ the weight of a genuine coalition is the number of its members.
The transformation $v_3$ is called the Maschler power. It can be interpreted
based on the idea that a two-person coalition can either rely on $v_1$ or instead
treat the game as if only the three-person coalition were permissible; this
means the two players $i$ and $j$ declare that they refuse to enter any two-person
coalition and insist on their equal shares in the three-person coalition.
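To make the three power transformations concrete, the sketch below applies them to the example game of section 3.1 ($g = 110$, $a = 100$, $b = 80$, $c = 70$). It is our own illustration: the coefficient $|C|/3$ for $v_2$ and the threshold $2g/3$ for $v_3$ follow our reading of the verbal description, coalitions are represented as `frozenset`s, and only proper coalitions are transformed.

```python
from fractions import Fraction

PLAYERS = frozenset({1, 2, 3})


def power_transformations(v):
    """Return (v1, v2, v3) for all proper coalitions of a three-person game.

    `v` maps each coalition (a frozenset of players) to its value."""
    g = v[PLAYERS]
    proper = [C for C in v if C != PLAYERS]
    v1, v2, v3 = {}, {}, {}
    for C in proper:
        surplus = g - v[C] - v[PLAYERS - C]
        v1[C] = v[C] + Fraction(1, 2) * surplus        # equal split of the surplus
        v2[C] = v[C] + Fraction(len(C), 3) * surplus   # split proportional to |C|
    for C in proper:
        if len(C) == 2:                                # Maschler power of pairs
            v3[C] = max(v1[C], Fraction(2, 3) * g)
    for C in proper:
        if len(C) == 1:                                # singletons get the rest
            v3[C] = g - v3[PLAYERS - C]
    return v1, v2, v3


game = {
    frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
    frozenset({1, 2}): 100, frozenset({1, 3}): 80, frozenset({2, 3}): 70,
    PLAYERS: 110,
}
v1, v2, v3 = power_transformations(game)
for C in sorted(v1, key=lambda s: (len(s), sorted(s))):
    print(sorted(C), v1[C], v2[C], v3[C])
```

By construction a coalition and its complement together receive exactly $g$ under each transformation, which is the sense in which the surplus is fully distributed.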

A problem of the power transformation $v_1$ is that the one-person values can
add up to more than $g$:

$$v_1(1) + v_1(2) + v_1(3) > g \quad \text{for } g > a + b + c.$$

A problem of the other two power transformations is that dummies in $v$ fail to
be dummies in $v_2$ and $v_3$. A dummy in a superadditive characteristic function
game is a player $i$ with

$$v(C) - v(C \setminus i) = v(i) \quad \text{for every permissible genuine coalition } C \text{ with } i \in C.$$

The power transformations allow us to define the power bargaining sets. Let $B$
be the ordinary bargaining set for $v$ and let $B_m$ be the set of all configurations in
the bargaining set for $v_m$ ($m = 1, 2, 3$) which are also configurations for $v$.
Maschler proposes not only to consider the configurations in $B$ as predictions
but also those in $B_m$, with $m = 1, 2, 3$.

The power bargaining sets as well as the original bargaining set do not exclude
the null-structure. However, the null-structure is very rarely observed in experiments.
Therefore, the predictive success of bargaining set theory can be considerably
improved by excluding the null-structure. Accordingly, we define the
bargaining sets $B_0$ and $B_{0m}$ as the bargaining sets $B$ and $B_m$, respectively, without
the configuration involving the null-structure. $B_0$ is called the bargaining set
without the null-structure, and the $B_{0m}$ are called power bargaining sets without
the null-structure.

Maschler (1978) observed that in his experiments players had a tendency
to agree on payoffs divisible by 5. Therefore, he proposed to neglect
deviations smaller than or equal to 5. As we shall see later, the number 5 has the
significance of a "prominence level" which determines the dividing line between
reasonably round numbers and other numbers in the perception of the players.
The prominence level is 5 in Maschler's experiments but in other experiments
with coalition payoffs in a different range it may assume other values. In order
to do justice to prominence in this sense we define the bargaining set $B_0[\Delta]$ as
the set of all configurations $(C; x_1, x_2, x_3)$ such that a configuration
$(C; y_1, y_2, y_3) \in B_0$ with $|y_i - x_i| \le \Delta$ for $i = 1, 2, 3$ can be found. Analogously,
$B_{0m}[\Delta]$ is the set of all configurations $(C; x_1, x_2, x_3)$ such that a configuration
$(C; y_1, y_2, y_3) \in B_{0m}$ with $|y_i - x_i| \le \Delta$ for $i = 1, 2, 3$ can be found. Usually, we will
have $\Delta = 5$. The sets $B_0[\Delta]$ and $B_{0m}[\Delta]$, resp., will be called bargaining sets and
power bargaining sets without the null-structure and with deviations up to $\Delta$.

Now the united bargaining set can be defined as the union of the three bargaining
sets without null-structure and with deviations up to $\Delta$:

$$U[\Delta] = B_0[\Delta] \cup B_{01}[\Delta] \cup B_{02}[\Delta].$$

The Maschler power bargaining set $B_{03}[\Delta]$ is not considered by this definition
since its inclusion does not improve predictive success.







3.2 The experiment of Murnighan and Roth

Murnighan and Roth (1977) conducted an experiment on the following zero-normalized
three-person game:
$g = 100$, $a = 100$, $b = 100$, the coalition 23 is not permitted.

All other genuine coalitions are permitted. They observed 412 plays in which
two-person coalitions were formed. Figure 3.1 shows the frequency distribution
over the share of player 1. For the numbers 50, 55, ..., divisible by 5, the
figure shows the number of cases in which player 1 received this amount. The
cases with payoffs for player 1 between two such amounts are aggregated to a
single category. Thus, the bar between the bars for 50 and 55 shows the frequency
of payoffs for player 1 strictly between 50 and 55.

A strong tendency to allocations divisible by 5 is observable. 




Figure 3.1. The experiment of Murnighan and Roth

The predictions of the different bargaining set theories are shown in table 3.2. 
In addition to this, the second last row shows the performance of the simple 
theory that player 1 gets at least 50. This theory is suggested by equity consider- 
ations. If both players in a permissible two-person coalition were equally strong 
both should receive 50. Since it is obvious that player 1 is stronger than the 
other player he should expect a payoff of at least 50. Since he is more powerful 
there are good reasons to suppose that he should get more than 50, but since it 
is very difficult to say how much more, 50 is the lower limit of what player 1 
should receive. 
Table 3.2. Bargaining set predictions for the game by Murnighan and Roth

Theory               Predicted range         Number of    Percentage of
                                             cases        correct predictions

B0[5]                95 ≤ x1 ≤ 100           16           3.9
B01[5]               70 ≤ x1 ≤ 80            113          27.4
B02[5] = B03[5]      61.67 ≤ x1 ≤ 71.66      103          25.0
U[5]                 95 ≤ x1 ≤ 100 or        188          45.6
                     61.67 ≤ x1 ≤ 80
                     50 ≤ x1 ≤ 100           399          96.8
                     0 ≤ x1 ≤ 100            412          100



The table shows that the predictions of the various bargaining sets are not very 
accurate. The united bargaining set U[5] predicts correctly in 45.6% of the cases 
only. The simple theory that player 1 receives at least 50 predicts correctly in 
96.8% of the cases. Of course, this simple theory predicts a greater range than 
the united bargaining set and it is easier to achieve more correct predictions with 
a greater range. However, even if the right kind of correction for range size 
differences is made, the simple theory is much more successful than the united 
bargaining set or each of its components. The way in which range sizes should 
be taken into account in comparisons between area theories will be discussed in 
section 3.5. 
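The percentages in table 3.2 are simple hit rates over the 412 observed plays. A short sketch (the labels and the helper name are ours; no correction for range size is attempted here) reproduces them from the case counts:

```python
OBSERVED_PLAYS = 412

# Number of plays falling into each predicted range, as reported in table 3.2.
correct_cases = {
    "B0[5]": 16,
    "B01[5]": 113,
    "B02[5] = B03[5]": 103,
    "U[5]": 188,
    "x1 >= 50": 399,
}


def percentage_correct(cases, total=OBSERVED_PLAYS):
    """Share of plays predicted correctly, in percent, rounded to one decimal."""
    return round(100.0 * cases / total, 1)


rates = {theory: percentage_correct(n) for theory, n in correct_cases.items()}
for theory, rate in rates.items():
    print(f"{theory}: {rate}%")
```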

It is very interesting to see how well the simple theory agrees with the data. 
The idea that the stronger player in a two-person coalition should get at least his 
equal share of the value seems to be a better predictor of behavior than the 
sophisticated bargaining set concept. The simple theory combines equity and 
power considerations. It does not say that the strong player must receive his 
equal share as payoff, but rather that, in view of his power position, the equal 
share is a lower bound for his payoff. 



3.3 The equal division payoff bounds 

This theory was introduced by Selten (1987) for three-person characteristic 
function games. The theory of equal division payoff bounds describes a hypo- 
thetical thought process. Unlike other area theories the theory of equal division 
payoff bounds is not based on a notion of stability. It does not exhibit the typical 
circularity of normative game theoretic concepts, but describes a finite sequence 
of reasoning steps leading to lower bounds for the players’ payoffs. For the sake 
of simplicity the theory of equal division payoff bounds will only be presented
for superadditive three-person games. It will however be clear how the theory 
has to be adjusted to the more general case. 

A subject looking at the game structure immediately perceives power 
differences among the players which can be described by an order of strength. 

Table 3.3. Order of strength for zero-normalized three-person games

Inequality           a > b > c    a > b = c    a = b > c    a = b = c
Order of strength    1 > 2 > 3    1 ~ 2 > 3    1 > 2 ~ 3    1 ~ 2 ~ 3

Table 3.3 shows the order of strength of the players 1, 2, and 3 for zero-normalized
three-person games. The sign ">" means "stronger" and the sign
"~" stands for "equally strong".

The order of strength is quite obvious even for naive subjects without any 
knowledge of game theory. Thus, if b>c then 1 is stronger than 2 since player 
1 can get b with player 3 whereas 2 can get only c with player 3 . 

The theory of equal division payoff bounds first focusses its attention on the 
strongest player and deduces what this player should expect to receive. Then it 
looks at the second strongest player, and so on. These considerations determine 
bounds (the tentative bounds), which limit the amount a player could expect. 

The first tentative bound is the coalition share $v(C)/|C|$ for $i \in C$, if no other
member of $C$ is stronger than $i$. Thus, for $b > c$ in coalition 12 player 1 has the
coalition share $a/2$, but player 2 does not have the coalition share $a/2$ in coalition
12 because player 1 is stronger.

For player 2 the substitution share is defined as $(a - b)/2$, which is half of the
surplus of coalition 12 over coalition 13. The idea is that player 2 can replace
player 3 in 13 and therefore can claim half of the surplus $a - b$. A similar
substitution share needs to be defined neither for player 1 nor for player 3. Player
1 could replace either 2 or 3 in coalition 23 but the share in the surplus would
be at most $a/2$, player 1's coalition share in 12. Player 3 cannot produce a
surplus by replacing another player in a two-person coalition.

Player $i$'s completion share $(g - v(jk))/3$ is based on the idea that player $i$
should get at least one third of the surplus of the grand coalition over the two-person
coalition of the other two players. Obviously, there is a connection to the
proportional surplus power transformation $v_2$ (see 3.1.1).

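The highest tentative bounds for 1 and 2 are

$$t_1 = \max\left[\frac{a}{2}, \frac{g}{3}\right]$$

$$t_2 = \begin{cases} t_1 & \text{for } b = c \\[1ex] \max\left[\dfrac{c}{2}, \dfrac{a - b}{2}, \dfrac{g - b}{3}\right] & \text{for } b > c. \end{cases}$$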



For player 3 a competitive bound $w$ is defined. Player 3 is a very weak player.
He has to make high offers to 1 and/or 2 in order to prevent coalition 12 from
being formed. Therefore, player 3 is interested in the highest amounts players 1 and 2
can receive in coalition 12. These highest amounts are $h_1 = a - t_2$ for player 1 and
$h_2 = a - t_1$ for player 2. The minimum of 3's surpluses over $h_1$ and $h_2$ is 3's
competitive bound:

$$w = \min[b - h_1, c - h_2].$$

The highest tentative bound for player 3 is the maximum of the completion
share and the competitive bound:

$$t_3 = \max\left[\frac{g - a}{3}, w\right].$$

Various simple considerations of equity and power have led to the definition
of tentative bounds. For every player the highest tentative bound is a natural
lower limit for his payoff aspirations. However, it may happen that in a game
with $g > a$ the sum of the highest tentative bounds $t_1 + t_2 + t_3$ is greater than $g$.
For $g > a$ the players nevertheless feel a strong urge to form the three-person
coalition, specifically if the game is run under favorable communication conditions
with face to face interaction. In order to make it possible to form a three-person
coalition in spite of $t_1 + t_2 + t_3 > g$, at least one player must decrease his
aspiration level to a value below his highest tentative bound. For $a > b$, which
means that 2 is stronger than 3, player 3 as the weakest one has to yield by
decreasing his aspiration level below $t_3$. In the case $a = b > c$, where 2 and 3 are
equally strong, player 1 is the one who yields since otherwise two players
instead of only one would have to decrease their aspiration levels. In the case
$a = b = c$ the equal share $g/3$ in the grand coalition is the natural aspiration level
for all three players.

These considerations lead us from the highest tentative bounds to what we call
preliminary bounds. The bounds are preliminary only in as far as the influence
of the prominence level still has to be considered. The preliminary bounds $p_1$,
$p_2$, and $p_3$ are as follows:

For $t_1 + t_2 + t_3 \le g$:

$p_i = t_i$ for $i = 1, 2, 3$

For $t_1 + t_2 + t_3 > g$:

$p_1 = t_1$, $p_2 = t_2$, $p_3 = g - a$ for $a > b \ge c$
Pi = I P2 = P3 f®'' 

$p_1 = p_2 = p_3 = \frac{g}{3}$ for $a = b = c$.

We always have $p_1 + p_2 + p_3 \le g$.

The final bounds emerge from the preliminary bounds by taking into account
the prominence level $\Delta$ and the smallest money unit $\gamma$:

$$u_i = \max\left[\gamma, \Delta \operatorname{int} \frac{p_i}{\Delta}\right],$$

where $\operatorname{int} x$ is the greatest integer not greater than $x$.

The final bounds are reached by rounding the preliminary bounds to the next
lower integer multiple of the prominence level $\Delta$. However, if this rounding
should lead to zero the final bounds will be one smallest money unit $\gamma$.

The predictions of the equal division payoff bounds are that a genuine coalition
$C$ with $v(C) \ge \sum_{i \in C} u_i$ is formed, if possible, and that $x_i \ge u_i$ for $i \in C$, if
$C$ is formed. The words "if possible" indicate that no prediction is made for
those extreme cases in which no configurations exist with the properties required
above. For example, this is the case for $a = b = c = g = \gamma$.

Table 3.4 gives an overview of the necessary calculations for the equal
division payoff bounds.
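Table 3.4. Equal division payoff bounds

Tentative bounds for 1 and 2

$t_1 = \max[a/2, g/3]$
$t_2 = t_1$ for $b = c$
$t_2 = \max[c/2, (a - b)/2, (g - b)/3]$ for $b > c$

Tentative bound for 3

$h_1 = a - t_2$ and $h_2 = a - t_1$
$w = \min[b - h_1, c - h_2]$
$t_3 = \max[(g - a)/3, w]$

Preliminary bounds

$p_i = t_i$ for $i = 1, 2, 3$ if $t_1 + t_2 + t_3 \le g$ or $g = a$.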

For $t_1 + t_2 + t_3 > g$:

$p_1 = t_1$, $p_2 = t_2$, $p_3 = g - a$ for $a > b \ge c$
Pi = I P2 = P3 

$p_1 = p_2 = p_3 = g/3$ for $a = b = c$

Final bounds

$u_i = \max[\gamma, \Delta \operatorname{int}(p_i/\Delta)]$, where $\operatorname{int} x$ is the greatest integer not greater than $x$
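The sequence of reasoning steps can be traced in code. The sketch below is an illustration under stated assumptions, not a definitive implementation: the formula used for $t_2$ in the case $b > c$ (the maximum of coalition share, substitution share and completion share) is our reading of the verbal definitions, the case $t_1 + t_2 + t_3 > g$ with $g > a$ is deliberately left out, and all function names are ours.

```python
from fractions import Fraction
import math


def tentative_bounds(g, a, b, c):
    """Highest tentative bounds (t1, t2, t3), assuming g >= a >= b >= c >= 0."""
    t1 = max(Fraction(a, 2), Fraction(g, 3))
    if b == c:
        t2 = t1
    else:
        # Assumed: coalition share in 23, substitution share, completion share.
        t2 = max(Fraction(c, 2), Fraction(a - b, 2), Fraction(g - b, 3))
    h1, h2 = a - t2, a - t1        # highest amounts for 1 and 2 in coalition 12
    w = min(b - h1, c - h2)        # competitive bound of player 3
    t3 = max(Fraction(g - a, 3), w)
    return t1, t2, t3


def preliminary_bounds(g, a, b, c):
    """Preliminary bounds for the case in which nobody has to yield."""
    t = tentative_bounds(g, a, b, c)
    if sum(t) <= g or g == a:
        return t
    raise NotImplementedError("t1 + t2 + t3 > g with g > a is not covered here")


def final_bounds(p, delta, gamma):
    """Round down to multiples of delta, but never below one money unit gamma."""
    return tuple(max(gamma, delta * math.floor(x / delta)) for x in p)


def admissible_coalitions(v, u):
    """Genuine coalitions whose value covers the final bounds of their members."""
    return [C for C in v if v[C] >= sum(u[int(i) - 1] for i in C)]


u = final_bounds(preliminary_bounds(110, 100, 80, 70), delta=5, gamma=1)
print(u)
print(admissible_coalitions({"12": 100, "13": 80, "23": 70, "123": 110}, u))
```

For the example game of section 3.1 the tentative bounds already sum to 100, which does not exceed $g = 110$, so no player has to yield. In the extreme case $a = b = c = g = \gamma$ the same functions return final bounds of one money unit each and no admissible coalition, in line with the remark that no prediction is made there.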
3.4 Prominence

Experiments have shown that players tend to select "round" numbers. In order
to clarify this phenomenon a theory of prominence in the decimal system has
been developed by Albers and Albers (1983). Let $X$ be the set of all integer
multiples of the smallest money unit $\gamma$. The prominence level $\Delta \in X$ is of the
form

$$\Delta = \mu \cdot 10^{\eta} \gamma, \quad \text{where } \mu = 1, 2, 5, 25 \text{ and } \eta = 0, 1, 2, \dots$$

The prominence level of $x$ is the greatest prominence level $\Delta$ such that $x$ is an
integer multiple of $\Delta$.
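As a minimal sketch of this definition (our own illustration), amounts can be measured in integer multiples of the smallest money unit $\gamma$, so that the admissible levels are the integers $\mu \cdot 10^{\eta}$. The function names are hypothetical.

```python
def prominence_levels(limit):
    """All levels mu * 10**eta (in units of gamma) up to `limit`, ascending."""
    levels = set()
    scale = 1
    while scale <= limit:
        for mu in (1, 2, 5, 25):
            if mu * scale <= limit:
                levels.add(mu * scale)
        scale *= 10
    return sorted(levels)


def prominence_level(x):
    """Greatest level that divides the positive integer x (in units of gamma)."""
    if x <= 0:
        raise ValueError("x must be a positive multiple of the money unit")
    return max(level for level in prominence_levels(x) if x % level == 0)


for amount in (50, 75, 60, 100, 7):
    print(amount, prominence_level(amount))
```

With $\gamma = 1$ the amount 75 has prominence level 25, whereas 50 and 100 are their own prominence levels; with $\gamma = 0.01$ the same amounts would be passed as 7500, 5000 and 10000 units.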

The basic idea behind these definitions is a picture of the mental process which 
takes place if a person has to decide on a number such as a price to be set. As 
an illustrative example we shall look at the case of a person who has to guess
the number of inhabitants of Islamabad. The first step in the process is the 
perception of a broad range in which the answer must lie, say, between 0 and 
20 million. Then the person looks at the midpoint of this range and asks himself 
whether the number of inhabitants is greater or smaller than 10 million. The 
process stops and ends with the answer 10 million if the person feels that there 
is no reason to answer one way or the other. Suppose, that the person decides 
that the number of inhabitants is smaller than 10 million. This narrows the range 
to the numbers between 0 and 10 million. Again, the person will look at the 
midpoint of the range, 5 million and consider the question whether the number 
of inhabitants is greater or smaller. The process stops with the estimate 5 
million if there are no good reasons to answer the question one way or the 
other. Suppose, that the person decides that the number of inhabitants is greater 
than 5 million. The midpoint of the remaining range between 5 and 10 million 
is 7.5 million. In this situation some persons may focus on 7.5 million but 
others on 7 or 8 million, since they perceive these numbers as "rounder". 
Whether 7.5 or 7 is considered as rounder may vary from person to person. 
Suppose, the subject focusses on 7 million and then decides that the number of 
inhabitants is smaller. He then may focus on 6 million and finally come to the 
conclusion that this is his estimate since he cannot find good reasons for decid- 
ing that the number to be guessed is greater or smaller. 

Basically, the process just outlined successively divides the range still considered
into two roughly equal parts and then focusses on the point separating the
subintervals. This point may be different from the exact midpoint if otherwise
one would obtain too "broken" a number. Thus, an interval of the length
$5 \cdot 10^{\eta}\gamma$ may be divided into two subintervals of lengths $2 \cdot 10^{\eta}\gamma$ and $3 \cdot 10^{\eta}\gamma$,
rather than split evenly. The decimal system makes successive equal division of
round intervals inconvenient since eventually a point will be reached where the
result becomes too messy. Intervals of the length $2.5 \cdot 10^{\eta}\gamma$ may still be tolerated
but a further equal subdivision to subintervals of the length $1.25 \cdot 10^{\eta}\gamma$ is too
inconvenient. This has the consequence that at least one of the subintervals will
always have a length of the form $\Delta = \mu \cdot 10^{\eta}\gamma$ where $\mu = 1, 2, 5, 25$ and $\eta = 0, 1,
2, \dots$. This is the motivation of the definition of the prominence level given
above.
My ideas about the role of decimal prominence in decision making may not be
exactly the same ones as those of Albers and Albers (1983), but they are not 
more than an elaboration of their basic picture. Albers is now in the process of 
developing a quite different view which, however, cannot be described here. 

For the purpose of the comparison of descriptive theories it is necessary to
define a prominence level for a whole data set as an estimate of the dividing line
separating sufficiently round numbers from other numbers. Table 3.5 illustrates
the computation of the prominence level for the data set of Murnighan and Roth.
In this experiment the smallest money unit has the value $\gamma = 0.01$.

Table 3.5. The prominence levels of player 1's share in the data of Murnighan and Roth
The first column of table 3.5 shows the prominence levels which appeared in
the distribution of the shares of player 1, shown in figure 3.1. The second
column shows the number of values observed at the respective prominence
level. Thus, for the prominence level of 25 only two values were observed,
namely 25 and 75. The third column shows the number of observations at the
respective prominence level. In the case of the prominence level 25 there are
24 observations. This means that 24 shares were either 25 or 75. For low levels
of prominence far fewer values are observed than there are numbers
with this prominence level in the range from 0 to 100.

The fourth column shows the ratio h(Δ)/m(Δ) of the number of observations
h(Δ) to the number of values m(Δ). This ratio can be looked upon as the relative
occupation of the values with the respective prominence level. It can be seen
that with some exceptions the relative occupation has a tendency to decrease
with decreasing prominence level. Moreover, the relative occupation drops
drastically from Δ = 5 to Δ = 2.5.

The fifth column shows the cumulative number of values M(Δ) and the sixth
column shows the cumulative number of observations H(Δ). The number of all
values observed in the data set is denoted by M and the number of all observations
is denoted by H. In table 3.5. we have M = 64 and H = 412. The next two
columns show the relative cumulative number of values M(Δ)/M and the relative
cumulative number of observations H(Δ)/H. The last column shows the difference

D(Δ) = H(Δ)/H − M(Δ)/M.

We call D(Δ) the relative cumulative occupation surplus or, for short, the occupation
surplus. The prominence level of a data set is defined as the greatest maximizer
of the occupation surplus. In the case of table 3.5. the prominence level
of the data set is 5. Already in figure 3.1. it was clearly visible that shares
divisible by 5 have a much greater frequency than others. The method of computation
confirms the visual impression that 5 is a reasonable value for the
prominence level of the data set.

The interpretation of this way of computing the prominence level of the data
set becomes clear if we look at the special case that the relative occupation
h(Δ)/m(Δ) is monotonically decreasing with decreasing prominence levels. In
this case the maximum of D(Δ) is obtained at the prominence level Δ* with the
property that the relative occupation h(Δ)/m(Δ) is greater than the mean occupation
H/M if and only if Δ ≥ Δ*. This means that the prominence level of the data
set separates the prominence levels with more than average occupation from
those with less than average occupation.
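
The computation described above can be sketched in a few lines of Python. The function name, the data layout, and the counts in the example are assumptions made for illustration; only the entry for level 25 (two values, 24 observations) is taken from the text, and the remaining numbers are not the data of table 3.5.

```python
from fractions import Fraction

def prominence_level(levels):
    """Return (greatest maximizer of the occupation surplus D, its value).

    levels: list of (delta, m, h), where m is the number of distinct values
    and h the number of observations at prominence level delta.
    """
    levels = sorted(levels, reverse=True)      # decreasing prominence level
    M = sum(m for _, m, _ in levels)           # number of all values observed
    H = sum(h for _, _, h in levels)           # number of all observations
    best_delta, best_D = None, None
    cum_m = cum_h = 0
    for delta, m, h in levels:
        cum_m += m                             # M(delta)
        cum_h += h                             # H(delta)
        D = Fraction(cum_h, H) - Fraction(cum_m, M)
        # strict '>' keeps the greatest level among tied maximizers
        if best_D is None or D > best_D:
            best_delta, best_D = delta, D
    return best_delta, best_D

# Hypothetical counts (delta, m, h); not the data of table 3.5.
example = [(50, 1, 30), (25, 2, 24), (10, 8, 80), (5, 10, 60), (1, 40, 20)]
delta_star, surplus = prominence_level(example)
print(delta_star, float(surplus))
```

In this made-up example the relative occupation h/m falls below the mean occupation H/M only at the lowest level, so the sketch selects the level just above it.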

As table 3.5. shows, the relative occupation h(Δ)/m(Δ) is not always decreasing
with decreasing prominence levels. This is partly due to random influences in
the case of prominence levels with a small number of observations h(Δ). However,
there are also other influences like the fact that often prominence levels of
the form 25 · 10^η are less frequent than those of the form 20 · 10^η (with
the same η in both cases). An individual does not really need both types of
prominence levels and may omit one in favor of the other. Thus, in table 3.5.
the prominence level 25 has a relative occupation of 12, which is lower than the
relative occupations of 20, 10, and 5. In spite of this lack of monotonicity the
computation method for the determination of the prominence level of a data set
still separates those prominence levels which on the whole have a higher relative
occupation than average from those which on the whole have a lower one.



3.5 The difference measure of predictive success 

Area theories differ with respect to the size of the predicted area. This has to be 
taken into account in the comparison of different area theories. We cannot 
simply look at the hit rate which is defined as the relative frequency of correct 
predictions. A correction for the relative size of the area has to be made. In the 
following the relative size of the predicted area within the set of all possible 
outcomes will simply be called the area. 

Selten and Krischker (1983) introduced the difference between hit rate and 
area as a measure of predictive success. In the following the hit rate will be 
denoted by r and the area by a. The difference measure by Selten and Krischker 
is 

m=r-a. 

At first glance it might seem to be arbitrary to measure predictive success by 
this difference rather than by another function of r and a, e.g. r/a. It has been 
argued by Selten (1991) that the simple alternatives to the difference measure 
have bad properties. E.g. if there is a unique most frequent outcome the mea- 
sure r/a favors the theory which predicts this single outcome only. All other 
theories have a lower ratio r/a. This means that a theory may be singled out by 
r/a even if it is almost always wrong. 
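
A small numerical illustration may help here. The two theories below and their hit rates and areas are hypothetical numbers chosen only to show the effect described above; they do not come from any of the data sets discussed in this chapter.

```python
def difference_measure(hit_rate, area):
    """Difference measure of predictive success, m = r - a."""
    return hit_rate - area

def ratio_measure(hit_rate, area):
    """Alternative measure r/a, shown only for comparison."""
    return hit_rate / area

# Hypothetical theories given as (hit rate r, area a).
point_theory = (0.05, 0.01)   # predicts only the single most frequent outcome
broad_theory = (0.90, 0.30)   # predicts a larger area and is usually right

for name, (r, a) in [("point", point_theory), ("broad", broad_theory)]:
    print(name, ratio_measure(r, a), difference_measure(r, a))
```

The ratio ranks the point theory first although it is wrong in 95% of the cases, whereas the difference measure ranks the broad theory first.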

In the paper by Selten (1991) an axiomatic justification of the difference 
measure has been given. Let m(r,a) be a measure based on hit rate r and area a. 
The following five axioms characterize the difference measure up to increasing 
monotonic transformations. 

Axiom 1 (r-monotonicity): m(r,a) > m(r’,a) for r’ < r
Axiom 2 (a-monotonicity): m(r,a) > m(r,a’) for a’ > a
Axiom 3 (continuity): m(r,a) is continuous on [0,1]×[0,1]
Axiom 4 (independence): if m(r’+r,a’+a) > m(r’,a’) then
                        m(r”+r,a”+a) > m(r”,a”)
Axiom 5 (indifference between trivial theories): m(0,0) = m(1,1)

Axioms 1 to 3 hardly need any comment. Obviously, r-monotonicity and a-monotonicity
should be satisfied and continuity is a reasonable requirement.
Axiom 4 can be interpreted as follows. Suppose there are three theories T, T’,
and T” such that the predicted area of T intersects neither that of T’ nor that of
T”. Let the hit rates be r, r’, r” and the areas a, a’, a”, respectively.
Consider the union T∪T’ of the theories T and T’ in the sense that the union of
both areas is predicted. Suppose that T∪T’ has a higher measure than T’. Then
T∪T”, understood in the same way, should have a higher measure than T”.
This means that whether joining a theory T to another one with a non-intersecting
predicted area is an improvement or not depends only on T and not on the
other theory. This seems to be a reasonable requirement.

In the paper presenting the axiomatization (Selten 1991) axiom 4 is expressed 
in another way, but it can be seen easily that the mathematical content is the 
same. The intuitive justification given here is a different one. 

A theory which predicts nothing is very sharp but never accurate. A theory 
which predicts everything is always correct but absolutely undiscriminating. 
Obviously, both types of theories are equally useless. It seems to be reasonable 
to give the same measure to both. This is expressed by axiom 5. 
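
That the difference measure satisfies axioms 4 and 5 can be checked mechanically on a grid of hit rates and areas. The following sketch is only a numerical sanity check under the stated grid, not a proof; the function names and the grid are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def m(r, a):
    """Difference measure m(r, a) = r - a."""
    return r - a

# Grid 0, 0.1, ..., 0.5 so that all sums of two grid points stay within [0, 1].
grid = [Fraction(k, 10) for k in range(6)]

def independence_holds():
    """Axiom 4: whether joining (r, a) improves a theory must not depend
    on the theory (rb, ab) it is joined to."""
    for r, a in product(grid, repeat=2):
        verdicts = {m(rb + r, ab + a) > m(rb, ab)
                    for rb, ab in product(grid, repeat=2)}
        if len(verdicts) != 1:
            return False
    return True

print(independence_holds(), m(0, 0) == m(1, 1))
```

For m = r − a the gain from joining T is always r − a, whatever the other theory is, which is why the check cannot fail.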

In the application of the measure to characteristic function games the problem
arises of how the relative size of the area should be computed. It is not adequate
simply to determine the number of predicted configurations and to divide it
by the number of all configurations. Doing this would put too great an emphasis
on the grand coalition. There are 101 ways of dividing 100 smallest money units
among two players but 5151 ways of dividing 100 smallest money units among 
three players. It seems to be adequate to give equal weights to all permissible 
coalition structures and within each structure equal weights to all configurations. 
This means that in the case of the zero-normalized three-person game with all 
genuine coalitions permissible each of the five coalition structures gets the 
weight 1/5. In a two-person coalition with the value 100 each configuration then 
receives the weight 1/505 and in a three-person coalition with a value of 100 the 
weight of a configuration is 1/25755. The area is computed as the sum of all 
weights of predicted configurations. 
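
The numbers quoted above can be reproduced with a short calculation. The helper name `n_divisions` is an assumption for illustration; the counting rule is the usual one for distributing indivisible units among players.

```python
from fractions import Fraction
from math import comb

def n_divisions(units, players):
    """Number of ways to divide `units` indivisible units among `players`
    (non-negative integer shares that sum to `units`)."""
    return comb(units + players - 1, players - 1)

n_structures = 5   # {1,2,3}, {1,2}|{3}, {1,3}|{2}, {2,3}|{1}, {1}|{2}|{3}

pair_weight = Fraction(1, n_structures * n_divisions(100, 2))
grand_weight = Fraction(1, n_structures * n_divisions(100, 3))

def area(n_pair_configs, n_grand_configs):
    """Area of a theory predicting the given numbers of configurations."""
    return n_pair_configs * pair_weight + n_grand_configs * grand_weight

print(n_divisions(100, 2), n_divisions(100, 3), pair_weight, grand_weight)
```

The function `area` assumes, for simplicity, that all predicted pair configurations lie in coalitions with the value 100; with other coalition values the weights would have to be computed per coalition.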

Table 3.6. shows hit rates, areas, and success measures for four data sets and
the theories B_0[5], U[5], and the equal division payoff bounds with Δ = 5 (denoted
by E_5). The table also shows significance levels for comparisons between
theories according to the Wilcoxon matched-pairs signed-ranks test applied to
success measures of independent subject groups.

For the purpose of comparing U[5] and B_0[5] the last two data sets have been
combined, since for each of them alone one obtains no significance.

For all four data sets the equal division payoff bounds have the highest success
measure and are significantly better than the other two theories.

Table 3.6. Comparison of predictive success

                                        Theory               Significance
                                                          U[5]          E_5
Experiment                         B_0[5]  U[5]   E_5     better than   better than
                                                          B_0[5]        U[5]

Maschler’s 27 plays     hit rate    .59    .89    .89
with grand coalition    area        .19    .20    .13
                        success     .40    .69    .76

Murnighan and Roth      hit rate    .04    .44    .92
412 plays               area        .03    .12    .31     .0001         .0001
                        success     .01    .32    .61

Rapoport and Kahan      hit rate    .51    .55    .92
160 plays               area        .08    .08    .18                   .01
8 quartets              success     .43    .47    .74
                                                          .05
                                                          (last two data
                                                          sets combined)
Medlin                  hit rate    .68    .72    .93
160 plays               area        .09    .09    .20                   .05
8 quartets              success     .59    .63    .73

E_5 denotes the equal division payoff bounds with Δ = 5

Uhlich (1989) compared the bargaining set and the equal division payoff bounds 
in 25 independent data sets and found a higher success measure for the equal 
division payoff bounds. Table 3.7. shows this result. 

Table 3.7. Uhlich’s 25 independent data sets

Theory      Success measure
B_0[Δ]      .1899
E_Δ         .6357

Δ is the prominence level of the data set

Uhlich (1989) developed a theory of proportional payoff bounds P_Δ, which is a
modification of E_Δ. It is also applicable to three-person games which are not
zero-normalized. For zero-normalized games E_Δ is slightly better than P_Δ.



4. The negotiation agreement area

The negotiation agreement area was first introduced in Uhlich (1989) and 
further tested and refined in Kuon and Uhlich (1993). It is a descriptive area 
theory for two-person games in which the conflict payoffs may be different from 
zero. 

Two players bargain over a constant sum of money v(12). The players alternate
in proposing until they either agree or one player decides to terminate the
negotiations. In case of conflict player i receives v(i). Without loss of generality,
number the players such that v(1) > v(2), so that player 1 is the stronger player.
The negotiation agreement area assumes that the strong player will start the
bargaining with a high demand close to the surplus:

A_1^max = v(12) − v(2).

The weak player, however, will start with a lower demand in the area between
the equal split of the surplus in addition to v(2) and the whole surplus:

A_2^max = v(12) − v(1) and A_2^min = v(2) + ½(v(12) − v(1) − v(2)).

The negotiation agreement area assumes that the final agreement is reached by
equal relative concessions from the initial demands. This assumption defines an
area which specifies lower bounds x_i for the two players:

x_1 = A_1^max / (A_1^max + A_2^max) · v(12),   x_2 = A_2^min / (A_1^max + A_2^min) · v(12).

An adjustment of these bounds to the prominence level of the data set leads to
the final bounds u_i:

u_i = max[v(i) + γ, Δ · int(x_i/Δ)], where Δ is the prominence level and γ is the
smallest money unit.
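
The following Python sketch computes the bounds for one hypothetical game. It rests on two assumptions that should be checked against Kuon and Uhlich (1993): the initial demands are taken to be exactly A_1^max = v(12) − v(2), A_2^max = v(12) − v(1), and A_2^min = v(2) + ½(v(12) − v(1) − v(2)), and "equal relative concessions" is read as scaling both demands by the same factor until they sum to v(12). The function name and the example numbers are illustrative only.

```python
from fractions import Fraction

def negotiation_agreement_bounds(v12, v1, v2, delta, gamma):
    """Lower bounds (u_1, u_2) of the negotiation agreement area.

    Assumed reading: each player's demand is scaled down proportionally
    until the two demands add up to v12, then rounded down to a multiple
    of the prominence level delta, but never below v(i) + gamma.
    """
    v12, v1, v2, delta, gamma = (Fraction(z) for z in (v12, v1, v2, delta, gamma))
    a1_max = v12 - v2                       # strong player's initial demand
    a2_max = v12 - v1                       # weak player's highest demand
    a2_min = v2 + (v12 - v1 - v2) / 2       # weak player's lowest demand
    x1 = a1_max / (a1_max + a2_max) * v12   # player 1 faces the highest demand
    x2 = a2_min / (a1_max + a2_min) * v12   # player 2 starts from the lowest demand

    def adjust(x, v):
        return max(v + gamma, delta * int(x / delta))

    return adjust(x1, v1), adjust(x2, v2)

# Hypothetical game: v(12) = 100, v(1) = 30, v(2) = 10, prominence level 5, unit 1.
u1, u2 = negotiation_agreement_bounds(100, 30, 10, 5, 1)
print(u1, u2)
```

Under these assumptions the predicted area for the example consists of all agreements that give player 1 at least u1 and player 2 at least u2.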

Figure 4.1 graphically displays the idea of the negotiation agreement area. 




Figure 4.1. The negotiation agreement area 

The negotiation agreement area was tested in 12 independent sessions of a
laboratory experiment. It was compared to several other solution concepts. The
best of the alternative theories was an interval around the equal split of the
surplus in addition to the alternative values, bounded by the next prominent
numbers (numbers having at least the prominence level of the data set). However,
table 4.1. shows that the negotiation agreement area has a higher success
measure than the equal split of the surplus interval. The difference between
these theories is significant.

Table 4.1. The negotiation agreement area (NAA) and the equal split of the surplus
interval (ES)

Theory      Success measure
NAA         .4873
ES          .2639


These results have been obtained by Kuon and Uhlich (1993). There, also games 
with negative conflict outcomes have been investigated. For such games the 
theory imposes the additional requirement that bargaining shares should not be 
negative. In the case of negative conflict outcomes the negotiation agreement 
area has a success measure of .7275 and is significantly better than the alterna- 
tive theories. 



5. The principle of balanced aspiration levels 

The principle of balanced aspiration levels has been discovered in the context of 
a macroeconomic decision game, where groups of students had to play the roles 
of an employers’ association, a workers’ union, and a central bank. One of the 
tasks was the wage bargaining between the employers’ association and the 
workers’ union. Before the bargaining the players had to answer questionnaires 
on what they will try to achieve. This experiment was conducted by Tietz and 
Weber (1972) and Tietz (1973). 

The employers’ association as well as the workers’ union had to specify the
following numbers:

- first demand

- planned outcome (the outcome they plan to achieve)

- at least attainable outcome (the outcome they expect to be able to push
through at least)

- conflict threat (at this point they threaten to strike or lockout)

- conflict limit (at this point they would rather strike or lockout than accept the
proposal)

These five levels form an ordinal scale of potential aspiration levels. The static
principle of balanced aspiration levels predicts that in terms of the ordinal scale 
the highest level achieved or surpassed by the final outcome will be the same 
one for both bargainers. This level is the highest common level. The determina- 
tion of the final outcome by the highest common level involves an interpersonal 
comparison between both players’ aspiration scales. Apparently, bargainers are 
guided by such comparisons. 

It turned out that the prediction of the static principle of balanced aspiration 
levels is very good. Tietz and Weber (1972) also proposed a theory, called the 
planning difference theory, which does not only predict the final outcome but 
also aspects of the bargaining process. In the next section we shall look at this 
semi-dynamic theory. 



6. The planning difference theory

The following notation will be used for the five levels and for the expected first
demand of the other player, which also had to be filled in in the preplay
questionnaires:

first demand                                 F
planned outcome                              P
at least attainable outcome                  A
conflict threat                              T
conflict limit                               L
expected first demand of the other player    E

The lower indices 1 and 2 will indicate variables specified in the questionnaire
of the workers’ union and the employers’ association, respectively.

Usually these numbers satisfy the inequalities

F_1 ≥ P_1 ≥ A_1 ≥ T_1 ≥ L_1

for player 1 (the workers’ union), and

F_2 ≤ P_2 ≤ A_2 ≤ T_2 ≤ L_2

for player 2 (the employers’ association).

The planning difference is defined as P_1 − P_2. If the planning difference is
positive then the plans cannot be realized. The concession reserve is F_1 − A_1 for
player 1 and A_2 − F_2 for player 2. The tacit concession is E_2 − F_1 for player 1
and F_2 − E_1 for player 2. If the other bargaining side expected a higher first
demand of the proposer, then this can be seen as if the proposer had already made a
concession.

We speak of an easy bargaining situation if P_1 < P_2. Then the player with the
smaller tacit concession makes the first concession. On the other hand, we speak
of a tough bargaining situation if P_1 > P_2. Then the player with the greater
concession reserve makes the first concession.

Let V be the highest common level reachable by both bargaining partners.
Then the bargaining result is that one of the values V_1 and V_2 which is more
favorable for the first concession maker.
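
The selection rule can be written down as a short Python sketch. The class and function names are illustrative assumptions, the tacit concession of player 2 is taken to be F_2 − E_1 (as in filter 3 of the dynamic theory in section 7), and ties as well as the case P_1 = P_2 are left undecided because the text does not specify them. The questionnaire values in the example are made up.

```python
from dataclasses import dataclass

@dataclass
class Questionnaire:
    F: float  # first demand
    P: float  # planned outcome
    A: float  # at least attainable outcome
    E: float  # expected first demand of the other player

def first_concession_maker(union, employers):
    """Return 1 (workers' union), 2 (employers' association), or None if undecided."""
    reserve = {1: union.F - union.A, 2: employers.A - employers.F}
    tacit = {1: employers.E - union.F, 2: employers.F - union.E}
    if union.P > employers.P:        # tough situation: greater concession reserve
        values, pick = reserve, max
    elif union.P < employers.P:      # easy situation: smaller tacit concession
        values, pick = tacit, min
    else:
        return None                  # P_1 = P_2 is not covered by the text
    if values[1] == values[2]:
        return None                  # tie-breaking is not specified in the text
    return pick(values, key=values.get)

def predicted_outcome(v1, v2, maker):
    """Choose between V_1 and V_2 in favor of the first concession maker:
    the union prefers the higher wage increase, the employers the lower one."""
    return max(v1, v2) if maker == 1 else min(v1, v2)

# Hypothetical questionnaires (wage increases in percent).
tough = first_concession_maker(Questionnaire(10, 8, 6, 3), Questionnaire(2, 4, 5, 11))
easy = first_concession_maker(Questionnaire(6, 4, 3, 2), Questionnaire(3, 6, 7, 8))
print(tough, easy)
```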

The planning difference theory involves a strategic stability problem. It seems
to be advantageous to name a ridiculously high first demand. Here "high" must 
be understood in terms of the ordinal aspiration scale; for the workers’ union a 
high demand is a high wage increase and for the employers’ association a high 
demand is a low wage increase. The higher the first demand is in this sense, the 
lower is the tacit concession and the higher is the concession reserve. Therefore, 
the chance to be the first concession maker is increased by a more aggressive 
first demand. Since the first concession maker is favored by the final outcome, 
this has the consequence that for a given first demand of the other player it is 
always possible to obtain the more favorable outcome by a sufficiently aggres- 
sive first demand. If this were the case, then experience should drive players to 
more and more aggressive first demands. However, no such tendency is ob- 
served. Probably an excessive first demand becomes unbelievable and therefore 
does not have the predicted effect. The planning difference theory is unsatisfac- 
tory in this respect. 

A second problem raised by the planning difference theory is an information 
transmission problem. How is the private information on aspiration scales 
transmitted by bargaining? If the way in which bargaining proceeds determines 
the final outcome it is advantageous for a player to behave as if he had aspira- 
tion levels leading to a better final result than honest information transmission. 
Players should be able to learn this by experience. Thereby, the validity of the 
theory would be destroyed. 

In the experiment conducted by Tietz the players were represented by groups 
of students which interacted repeatedly for quite a number of periods. The 
situation had aspects of a supergame and the players may have taken advantage 
of this fact. Supergames facilitate cooperation and in the case at hand coopera- 
tion may take the form of not being too dishonest about the strength of one’s 
motivation. In repeated interaction dishonesty could not be maintained without 
inducing strong suspicions on the other side. This kind of cooperation may 
explain how the transmission problem is solved. 

Unfortunately, the answer to the transmission problem outlined above is 
probably not the whole story because the principle of balanced aspiration levels 
also seems to work in the studies with one shot bargaining situations by Scholz 
(1980) and Scholz, Fleischer, and Bentrup (1983). Maybe, the way in which 
people bargain is adapted to repeated bargaining situations, but not to one shot 
bargaining. The reasons for this could either lie in man’s evolutionary past or in 
everyday experience. Maybe, many subjects have learned their bargaining 
behavior mainly in repetitive interactions with the same people. 

In the next section the dynamic aspiration balance theory (Tietz 1976) will be 
described. Like the planning difference theory this more ambitious fully dynamic 
theory also faces the information transmission problem. However, simulations 
by Tietz, Daus, Lautsch, and Lotz (1988) indicate that the strategic stability 
problem does not arise in the dynamic aspiration balance theory. 



7. The dynamic aspiration balance theory

The dynamic aspiration balance theory aims at the prediction of the whole bar- 
gaining process. It has been developed by Tietz (1976). The theory first refines 
the scale of aspiration levels by introducing midpoint levels. 

F            level 8
½(F+P)       level 7
P            level 6
...
L            level 0

The secured level is the highest level reached by the opponent’s last offer. The
aspiration disadvantage is the number of levels by which the opponent’s secured
level surpasses the own secured level. A normal concession is a concession by
two level units. The theory determines who the first concession maker will be.
This is done with the help of three successively applied criteria, called filters.

Filter 1: Who has the higher secured level?
Filter 2: Who has the higher concession reserve F_1 − A_1 or A_2 − F_2, resp.?
Filter 3: Who has the lower tacit concession E_2 − F_1 or F_2 − E_1, resp.?

The player who is selected by the highest filter is the first concession maker.
If all filters are indecisive a random selection will determine the first concession
maker. The strength of the first concession maker is

3 if filter 2 or filter 3 would select the other party,
2 if filter 3 is decisive,
1 else.
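
The refined scale and the secured level can be illustrated by a small Python sketch. It assumes that the levels between P and L are built in the same way as the one level shown explicitly (each original level gets an even number and each midpoint the odd number in between); the function names and the example aspiration levels are hypothetical.

```python
def refined_scale(F, P, A, T, L):
    """Return the nine values of the refined scale, indexed by level 0..8.

    Assumption: even levels are L, T, A, P, F and odd levels are the
    midpoints between neighboring original levels.
    """
    base = [L, T, A, P, F]
    scale = []
    for low, high in zip(base, base[1:]):
        scale.extend([low, (low + high) / 2])
    scale.append(F)
    return scale

def secured_level(scale, offer, prefers_high):
    """Highest level reached by the opponent's last offer, or None if the
    offer does not even reach level 0. prefers_high is True for the
    workers' union and False for the employers' association."""
    reached = [k for k, value in enumerate(scale)
               if (offer >= value if prefers_high else offer <= value)]
    return max(reached) if reached else None

# Hypothetical aspiration levels (wage increases in percent).
union_scale = refined_scale(F=10, P=8, A=6, T=5, L=4)
employer_scale = refined_scale(F=2, P=4, A=5, T=6, L=7)

union_secured = secured_level(union_scale, 5.6, prefers_high=True)
employer_secured = secured_level(employer_scale, 6.2, prefers_high=False)
# Aspiration disadvantage of the union: opponent's secured level minus own.
union_disadvantage = employer_secured - union_secured
print(union_secured, employer_secured, union_disadvantage)
```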

The strength determines the size of the first concession. The first concession is 
a normal concession unless (1), (2), or (3) is fulfilled. 

(1) The aspiration disadvantage would become greater than 3 or the strength of 
the concession maker is at least 2. Then the first concession maker will 
make a concession of 1 level unit. 

(2) The absolute difference between the first demands is very small (<.25% 
wage increase). Then the other player’s first demand is accepted. 

(3) A normal concession would go beyond the other player’s first demand.
Then the first concession maker concedes to the midpoint between the first
demands, or by 1% wage increase, whichever is smaller.

From now on the players alternate in making concessions. The size of a
counterconcession can be determined by the accountable preconcession. This is
the amount in wage units by which the opponent’s last demand is more favorable
than the conflict limit. The size of the first counterconcession is the accountable
preconcession multiplied by the strength of the first decision maker, unless ...
(similar exceptions as in the case of the first concession). The size of the further
concessions, in wage units, is equal to the size of the opponent’s last concession,
unless ... (details are omitted).

This theory, which we do not present in full detail, completely determines the
bargaining process. It was tested with a sample of 30 periods of 12 games.
Many periods had to be excluded since the theory does not directly apply to
bargaining over more than one variable and does not cover the effects of central
bank intervention. The planning difference theory predicted the end result
correctly in 73% of all cases and the dynamic aspiration balance theory even in
93% of the sample. In 60% of all cases the dynamic aspiration balance theory
correctly predicts the complete process up to deviations of .05% (wage increase
percentages).

Table 7.1. gives an overview of the empirical evidence for the static aspiration
balance principle and the dynamic aspiration balance theory.

Even if not always exactly the same theories were tested the overall result of 
these studies is quite impressive. Of course, in view of the importance of the 
subject matter many more experiments are needed before a final judgement can 
be passed. The static aspiration balance theory is quite simple and seems to be 
well supported. The dynamic aspiration balance theory is a complex set of rules 
and it has not always been applied in exactly the same form. One does not find 
equal support for all details. Some cases covered by special rules only rarely 
occur in the data. Obviously, more experimental research is necessary. 

Table 7.1. Empirical evidence

Studies (columns): Tietz & Weber 1972; Tietz & Weber 1978; Scholz 1980;
Scholz, Fleischer & Bentrup 1983; Tietz & Bartos 1983; Croßmann & Tietz 1983

Rows: Interaction (verbal, formal, repeated, one shot); Support (static,
dynamic); Sample

Notes to the table:

- 4 repetitions

- with partner choice

- one repetition with seemingly different opponents

- weak predictions of a version adapted to many bargaining variables; equal
support for dynamic aspiration security theory (not explained here)

- better support for planning difference theory

- modified version

- 60 expert subjects with practical bargaining experience in business and 50
novices

- combined with a proposal search model



The principle of balanced aspiration levels has the potential to be a powerful tool
of economic modelling. Unfortunately, up to now the way in which economic
behavior is modelled is heavily influenced by empirically unsupported normative
presumptions. It would be better to make more use of experimentally based
behavioral theories.



8. Cooperation in normal form games 

Ostmann (1988) conducted experiments with 3×3×3-games G = (A_1,A_2,A_3;u).
In the following the notation will be explained and some basic concepts will be
introduced.

A_i is the pure strategy set of player i (i = 1, 2, 3).
a = (a_1,a_2,a_3) with a_i ∈ A_i is a strategy combination of the three players.
u(a) = (u_1(a),u_2(a),u_3(a)) is the payoff function. u_i(a) is i’s payoff for a.
a_C = (a_i)_{i∈C} is a coalition strategy. The notation a = a_C a_{N\C} is used.
x_C = (x_i)_{i∈C} is a coalition payoff vector.

A coalition C can absolutely secure y_C = (y_i)_{i∈C} iff it has a strategy a_C such that
for every a_{N\C}
u_i(a_C a_{N\C}) ≥ y_i for all i ∈ C.

A coalition C can conditionally secure y_C = (y_i)_{i∈C} iff for every a_{N\C} it has a
strategy a_C with
u_i(a_C a_{N\C}) ≥ y_i for all i ∈ C.

x = (x_1,x_2,x_3) is in the α-core iff no coalition C can absolutely secure a
y_C = (y_i)_{i∈C} with y_C ≠ x_C and y_i ≥ x_i for i ∈ C. x = (x_1,x_2,x_3) is in the β-core iff no
coalition C can conditionally secure a y_C = (y_i)_{i∈C} with y_C ≠ x_C and y_i ≥ x_i for
i ∈ C. The β-core is a subset of the α-core.

a_{N\C} is a Pareto-best reply to a_C iff for no a’_{N\C} we have u_i(a_C a’_{N\C}) ≥ u_i(a_C a_{N\C})
for i ∈ N\C with u_i(a_C a’_{N\C}) > u_i(a_C a_{N\C}) for at least one i ∈ N\C. B(a_C) is the set
of all Pareto-best replies to a_C.

A coalition C can attain y_C = (y_i)_{i∈C} as a leader payoff iff it has a strategy a_C
such that for all a_{N\C} ∈ B(a_C)
u_i(a_C a_{N\C}) ≥ y_i holds for i ∈ C.

x = (x_1,x_2,x_3) is in the γ-core iff no coalition C can attain a y_C = (y_i)_{i∈C} with
y_C ≠ x_C and y_i ≥ x_i for i ∈ C as a leader payoff. The γ-core is a subset of the
α-core.

The minimal core is the intersection of all those cores (α, β, and γ) which are
non-empty.

An equilibrium (in pure strategies) is a strategy combination a = (a_1,a_2,a_3) with
a_i ∈ B(a_{N\i}) for i = 1, 2, 3.

A Rawls optimum is a strategy combination a = (a_1,a_2,a_3) with the property that
for no other strategy combination a’ = (a’_1,a’_2,a’_3) the inequality
min_{i=1,2,3} u_i(a’) > min_{i=1,2,3} u_i(a) holds.

The notions of the α- and the β-core are well known game theoretic concepts
(Aumann 1961 and Scarf 1967), but the γ-core was first introduced in the
work of Ostmann.
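
Two of these concepts, the pure strategy equilibrium and the Rawls optimum, are easy to compute by enumeration. The sketch below uses a hypothetical 2×2×2 game for brevity (the experiments used 3×3×3-games); the function names and the payoffs are assumptions for illustration. For a single player the set of Pareto-best replies is simply the set of best replies, which is what the equilibrium check uses.

```python
from itertools import product

def pure_equilibria(u, sizes):
    """All strategy combinations in which every player uses a best reply.

    u maps each strategy triple to a payoff triple; sizes gives the number
    of pure strategies of each of the three players.
    """
    result = []
    for a in product(*(range(n) for n in sizes)):
        stable = True
        for i in range(3):
            for s in range(sizes[i]):
                deviation = a[:i] + (s,) + a[i + 1:]
                if u[deviation][i] > u[a][i]:
                    stable = False
        if stable:
            result.append(a)
    return result

def rawls_optima(u):
    """All strategy combinations that maximize the minimal payoff."""
    best = max(min(payoffs) for payoffs in u.values())
    return [a for a, payoffs in u.items() if min(payoffs) == best]

# Hypothetical game: strategy 0 = contribute, 1 = do not contribute. Every
# contribution gives 2 points to each player and costs the contributor 3.
sizes = (2, 2, 2)
u = {}
for a in product(range(2), repeat=3):
    contributors = a.count(0)
    u[a] = tuple(2 * contributors - (3 if a[i] == 0 else 0) for i in range(3))

print(pure_equilibria(u, sizes), rawls_optima(u))
```

In this made-up game the unique equilibrium and the unique Rawls optimum differ, which is the kind of tension the experiments were designed to explore.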

Ostmann conducted an experiment involving various 3×3×3-games. The
payoffs in the game were in points. Money payoffs were determined by a non-linear
transformation from points to money, a different one for each player and
known only to the player himself. Therefore, the payoffs in the game were not
interpersonally comparable. Nevertheless, the players sometimes acted as if
interpersonal comparability was possible. Bargaining was mostly face to face
and lasted until an agreement was reached. The agreement could be on the grand
coalition, a pair coalition or on the fact that no agreement can be reached. The
coalition agreements were not binding. After the end of bargaining the players
independently and simultaneously selected their strategies. So, finally this was
a non-cooperative game.

In table 8.1. a distinction is made between normal form games with and 
without a pure strategy equilibrium in the minimal core. It can be seen that the 
results are quite different for these two categories of games. An agreement is 
called stable if it is not violated by the final choice. The numbers of stable 
agreements are shown in brackets. The categories in table 8.1. are partially 
overlapping. 

Table 8.1. Experimental results

                      Equilibrium in minimal core?
                      Yes           No

no coalition            3            13
pair coalition         18 (17)       49 (45)
grand coalition        95 (92)      148 (78)
minimal core           78 (76)       78 (33) ¹
Rawls optimum          83 (82)      130 (62)
equilibrium            75 (74)       10 (9) ²
number of plays       116           210

in brackets: agreements not violated by final choice
¹ out of 177 games with non-empty minimal core
² out of 135 games with equilibria outside the minimal core



The results of the experiment can be summarized as follows: 

- only in few cases no agreement is reached 

- most agreements are grand coalitions 

- pair coalitions are very stable 

- grand coalitions are very stable if there is an equilibrium in the minimal core 
but rather unstable otherwise 

- equilibrium occurs often if there is an equilibrium in the minimal core but
rarely otherwise
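
The statements about stability can be checked against the counts of table 8.1. The short Python sketch below only recomputes shares from the numbers in the table; the variable names are illustrative, and the base of 135 plays for the equilibrium share in the second column follows the footnote to the table.

```python
# (agreements, stable agreements) taken from table 8.1.;
# "yes" / "no" refers to whether there is an equilibrium in the minimal core.
agreements = {
    "pair coalition": {"yes": (18, 17), "no": (49, 45)},
    "grand coalition": {"yes": (95, 92), "no": (148, 78)},
}

stability = {
    kind: {case: stable / total for case, (total, stable) in cases.items()}
    for kind, cases in agreements.items()
}

# Share of plays whose agreement was an equilibrium.
equilibrium_share = {"yes": 75 / 116, "no": 10 / 135}

for kind, cases in stability.items():
    print(kind, {case: round(share, 3) for case, share in cases.items()})
print({case: round(share, 3) for case, share in equilibrium_share.items()})
```

Pair coalitions are kept in more than nine out of ten cases in both columns, whereas the share of stable grand coalitions falls from about 97% to about 53% when there is no equilibrium in the minimal core.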

Cooperation in normal form games is a very interesting subject matter. 
Ostmann’s results indicate that the minimal core is a useful predictive concept. 
In their bargaining the players seem to use arguments related to the α-, β-, and
γ-core. It is understandable that an argument related to one of these cores does
not have much force if the players find out that the core in question is empty.
Therefore, it is reasonable to define the minimal core as the intersection of the
non-empty cores of this kind.

A more detailed analysis of Ostmann’s data will probably be published in the 
near future. 



9. Duopoly strategies programmed by experienced players 

It is perhaps no exaggeration to say that in the many years since the work of
Cournot (1838) economics has not yet produced an empirically well supported
duopoly theory. Selten, Mitzkewitz, and Uhlich (1988) applied the strategy
method to an asymmetric Cournot duopoly. The strategy method is an 
experimental procedure in which the players first play the game interactively, so 
they experience the strategic structure of the game. Afterwards, each player has 
to develop a computer program specifying a strategy for the game. The strate- 
gies are matched in a computer tournament. A strategy study may involve 
several tournaments between which the subjects have the opportunity to change 
their strategies. 

The strategy method was proposed by Selten (1967) and found further applica- 
tions, inter alia, in Axelrod (1984), Fader and Hauser (1988), Keser (1992), and 
Kuon (1994). 

The study of Selten, Mitzkewitz, and Uhlich concerned a 20-period supergame
of an asymmetric quantity variation model with linear cost and linear demand.
The parameters were as follows:

costs:    C_1 = 9820 + 9x_1     C_2 = 1260 + 51x_2
demand:   p = max[0, 300 − x_1 − x_2].

Here, x_1 and x_2 are the production quantities, C_1 and C_2 are the costs of
duopolist 1 and 2, resp., and p is the price.
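
As a point of reference for the results reported below, the Cournot equilibrium of the one-period game can be computed directly from these parameters. The sketch assumes an interior solution, where each duopolist's best reply is x_i = (300 − c_i − x_j)/2 with c_i the marginal cost; the constant and function names are illustrative.

```python
from fractions import Fraction

FIXED_COST = (9820, 1260)     # fixed costs of duopolist 1 and 2
MARGINAL_COST = (9, 51)       # marginal costs of duopolist 1 and 2
INTERCEPT = 300               # demand intercept

def price(x1, x2):
    return max(0, INTERCEPT - x1 - x2)

def profit(i, x1, x2):
    """Profit of duopolist i (0 or 1) at quantities x1, x2."""
    x = (x1, x2)[i]
    return price(x1, x2) * x - FIXED_COST[i] - MARGINAL_COST[i] * x

def cournot_quantities():
    """Solve the two linear best-reply equations x_i = (a_i - x_j) / 2,
    where a_i = INTERCEPT - MARGINAL_COST[i]."""
    a1 = INTERCEPT - MARGINAL_COST[0]
    a2 = INTERCEPT - MARGINAL_COST[1]
    return Fraction(2 * a1 - a2, 3), Fraction(2 * a2 - a1, 3)

x1, x2 = cournot_quantities()
print(x1, x2, price(x1, x2), profit(0, x1, x2), profit(1, x1, x2))
```

With these parameters the sketch yields quantities of 111 and 69, a price of 120, and Cournot profits of 2501 and 3501 for duopolist 1 and 2, respectively.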

Figure 9.1. shows a graphical representation of the model. 

In this strategy experiment 23 subjects participated in the framework of a
student seminar. The seminar started with three game playing rounds where
the players interacted anonymously by computer terminals. After this experience
phase three strategy programming rounds with computer tournaments followed.
Each subject had to write a strategy for both sides. The students were motivated
by grades.

Figure 9.1. Graphical representation of the model



Figures 9.2. to 9.4. show the results of the game playing rounds. In these 
diagrams each point represents the profits of player 1 and 2 in a play. In the 
first game playing round (figure 9.2.) many players had payoffs below the 
Cournot profits. In the second game playing round (figure 9.3.) the results had 
already improved, but many profits were still below the Cournot profit. In the 
third game playing round (figure 9.4.) cooperation was learned to a large extent. 

Figure 9.2. Result of the first game playing round 




Figure 9.3. Result of the second game playing round 

Figure 9.4. Result of the third game playing round 



Figures 9.5. to 9.7. show the results of the three tournaments of the strategies. 
In these diagrams each point represents one subject’s pair of tournament profits 
for both player roles. In this respect figures 9.5. to 9.7. differ from figures 9.2. 
to 9.4., in which points represented plays. In the first tournament (figure 9.5.) 
many profits are near the Cournot point and some profits are below Cournot. In 
the second tournament (figure 9.6.) the bulk of the observations is above the 
Cournot profit. In the third tournament (figure 9.7.) full cooperation is not 
achieved, but the profits are close to the Pareto frontier. 









Figure 9.5. Result of the first tournament of the strategies 




Figure 9.6. Result of the second tournament of the strategies 

Typically, the participants approached the task of strategy construction as 
follows. They knew from their game playing experience that cooperation is 
profitable. Therefore, they tried to answer the following two questions in this order: 

(1) Where do I want to achieve cooperation? 

(2) What do I have to do in order to achieve cooperation? 

Typically, the first question would be answered by a pair of production quanti- 
ties for both players which we call the player’s ideal point. Usually, ideal points 
were based on fairness criteria such as maximal equal profits or maximal equal 
additional profits over Cournot profits or maximal profits proportional to Cournot 
profits. Some strategies used different ideal points for the two player roles, 
while others used the same one in both roles. 

The second question concerns the way in which the other player should be 
induced to cooperate at one’s own ideal point. The typical answer to this ques- 
tion is a measure-for-measure policy, which rewards reductions of the other 
player’s quantity in the direction of the ideal point by a similar reduction of 
one’s own quantity and punishes increases of the other player’s quantity away 
from the ideal point by a similar movement away from it. Measure-for-measure 
policies may vary with respect to the way in which the word similar is made 
precise. Similar may mean equal quantity changes or equal percentage changes 
in terms of the difference between Cournot quantity and ideal point quantity, or 
something more complicated, like equal change in profits. 
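
One way to make the word similar precise, the equal-percentage variant, can be sketched in Python as follows. The sketch is only an illustration under stated assumptions: the function name, the clamping of responses to the interval between ideal point quantity and Cournot quantity, and the ideal point used in the example are hypothetical and not taken from the study. The Cournot quantities 111 and 69 follow from the cost and demand parameters of the model if an interior equilibrium is assumed.

```python
def measure_for_measure(opp_last, opp_cournot, opp_ideal,
                        own_cournot, own_ideal):
    """Equal-percentage measure-for-measure response (illustrative sketch).

    The opponent's last quantity is located on the interval between the
    opponent's Cournot quantity (share 0) and the opponent's quantity at
    one's own ideal point (share 1). The response sits at the same share
    of one's own interval between Cournot and ideal point quantity.
    """
    if opp_cournot == opp_ideal:
        raise ValueError("Cournot and ideal point quantity must differ")
    share = (opp_cournot - opp_last) / (opp_cournot - opp_ideal)
    # Assumption: responses never leave the interval between the ideal
    # point quantity and the Cournot quantity.
    share = min(1.0, max(0.0, share))
    return own_cournot - share * (own_cournot - own_ideal)


# Player 1 responds to player 2. Cournot quantities are 111 and 69;
# the ideal point (80, 50) is a hypothetical choice for illustration.
for opp in (75, 69, 60, 50):
    print(opp, measure_for_measure(opp, 69, 50, 111, 80))
```

In this sketch a reduction of the opponent's quantity toward the ideal point is answered by a proportional reduction of one's own quantity, and an increase is answered by a proportional increase, up to the Cournot quantity.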

A measure-for-measure policy can be considered to be a generalization of the 
tit-for-tat rule, which turned out to be the most successful in Axelrod’s (1984) 
tournaments. However, in the case of the prisoner’s dilemma there is no question 
about where cooperation should take place. There is only one answer to this 
question which respects the symmetry of the game. Moreover, there are only 
two actions which can be chosen. Therefore, the question of what constitutes 
a similar response to a deviation does not arise. Tit-for-tat is a special case 
of a measure-for-measure policy with a very restricted domain of application. 

Typically, a strategy distinguished three phases of playing the game: an initial 
phase, a main phase, and an end phase. The initial phase served to signal 
cooperativeness by a fixed sequence of decreasing quantities. In the main 
phase a measure-for-measure policy was applied. In the end phase cooperation 
was abandoned in favor of some kind of non-cooperative behavior. Of course, 
this is only a rough picture of typical behavior, which, however, is supported by 
an analysis of the final tournament strategies. 
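
The three-phase structure can be illustrated with a short Python sketch. It is a rough illustration, not a reconstruction of any subject's program: the phase lengths, the opening quantities, the ideal point, and all names are hypothetical, and the Cournot quantities 111 and 69 are computed from the cost and demand parameters under the assumption of an interior equilibrium.

```python
PERIODS = 20                      # length of the supergame in the study
COURNOT = (111.0, 69.0)           # Cournot quantities of players 1 and 2
IDEAL = (80.0, 50.0)              # hypothetical ideal point
OPENING = ((105.0, 100.0, 95.0),  # hypothetical fixed opening outputs
           (66.0, 63.0, 60.0))
END_LENGTH = 3                    # hypothetical length of the end phase


def three_phase(role, period, opp_history):
    """Quantity of player `role` (0 or 1) in `period` (0-based)."""
    other = 1 - role
    if period < len(OPENING[role]):
        # Initial phase: fixed, decreasing outputs, independent of history.
        return OPENING[role][period]
    if period >= PERIODS - END_LENGTH:
        # End phase: cooperation is given up in favor of the Cournot output.
        return COURNOT[role]
    # Main phase: equal-percentage measure-for-measure response to the
    # opponent's most recent quantity.
    gap = COURNOT[other] - IDEAL[other]
    share = (COURNOT[other] - opp_history[-1]) / gap
    share = min(1.0, max(0.0, share))
    return COURNOT[role] - share * (COURNOT[role] - IDEAL[role])


def play():
    """Let the strategy play against itself in both roles."""
    hist = ([], [])
    for t in range(PERIODS):
        q1 = three_phase(0, t, hist[1])
        q2 = three_phase(1, t, hist[0])
        hist[0].append(q1)
        hist[1].append(q2)
    return hist


h1, h2 = play()
print(h1)
print(h2)
```

Both decisions of a period are computed before either history is updated, so each player reacts only to quantities observed in earlier periods.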

The evaluation of the final tournament strategies identified certain structural 
properties, referred to as characteristics. Each characteristic was present in the 
majority of strategies to which this characteristic is applicable. The list of these 
characteristics follows. 

Typical characteristics of the strategies of the last tournament 

General principles 

1. No prediction 

2. No random decision 

3. Not only integer outputs 

Initial phase 

4. fixed outputs for the first two to four periods 

5. last fixed output at least 8% below Cournot 

Main phase (Measure-for-measure-principle) 

6. decisions are guided by one or two ideal points 

7. response with own ideal output if opponent’s output below own ideal point 

8. Cournot-output if opponent’s output above Cournot 

9. response to Cournot-output at most 5% below Cournot 

10. reduction as response to reduction (between ideal point and Cournot) 

11. increase as response to increase (between ideal point and Cournot) 

End phase 

12. end phase in the last two to four periods 

13. Cournot output in all periods of end phase 

The first characteristic, no prediction, means that no attempt is made to predict 
the opponent’s quantity in the next period. The oligopoly theories in the litera- 
ture, at least as far as they are known to me, all involve some prediction about 
the opponent’s quantity of the next period. It is therefore surprising that only 
five of the 23 strategies of the last tournament make such predictions. This is 
connected to the fact that typically no attempt was made to optimize against the 
expected behavior of the other player. 

The second characteristic means that the strategy involved no random 
elements. The third characteristic means that quantities were typically 
determined by smooth formulas rather than by regions with fixed integer outputs 
based on case distinctions. Casuistic strategies of the latter type were also 
sometimes observed. 

Characteristic 4 means that in the initial phase outputs do not depend 
on past history. Characteristic 5 means that the initial phase provides a 
sufficiently strong signal of cooperativeness. Of course, the reduction by at 
least 8% is to some degree arbitrary, but it has been chosen in such a way that 
the characteristic is shared by a majority of strategies with an initial phase. 

The characteristics 6 to 11 for the main phase concern aspects of the 
measure-for-measure principle. The presence of one or two ideal points is covered 
by characteristic 6. Characteristics 7 and 8 restrict response quantities to the 
interval between Cournot quantity and ideal point quantity. Sometimes strategies 
signal cooperativeness by an output below Cournot production. Characteristic 9 
restricts the percentage by which such signalling outputs remain below the 
Cournot quantity. The typical response to reductions and increases of the other 
player’s quantity is covered by characteristics 10 and 11. 

Characteristic 12 describes the presence of an end phase. Characteristic 13 
means that the Cournot output is offered in all periods of the end phase. Some 
strategies behave differently in the end phase, e.g. they gradually increase 
outputs. 

The distribution of the characteristics among the strategies is shown in the 
incidence matrix displayed in figure 9.8. The strategies are ranked according to 
success. Visual inspection of figure 9.8. shows that less successful strategies 
tend to have fewer of the characteristics. This suggests a connection between 
typicalness and success. However, it would be wrong to measure the typicalness 
of a strategy by the number of its characteristics, since the characteristics should 
be weighted according to their typicalness, which in turn should depend on the 
typicalness of the strategies in which they occur. In order to do justice to these 
considerations, the typicalness of strategies and characteristics is measured by 
numbers between zero and one, called typicities, defined by the following 
principles. 

1. The typicity of a strategy is the sum of the typicities of its characteristics 

2. The typicity of a characteristic is proportional to the sum of the typicities 
of the strategies with it 

3. The typicities of the characteristics sum up to 1 

4. The typicities are eigenvectors associated with the greatest eigenvalue of the 
eigenvalue problem resulting from 1., 2., and 3. 
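
Principles 1 to 4 can be illustrated numerically. The Python sketch below is an illustration under stated assumptions and not necessarily the computation used in the study: the incidence matrix is made up, the function name is hypothetical, and power iteration is simply one standard way to approximate the eigenvector belonging to the greatest eigenvalue.

```python
def typicities(incidence, iterations=200):
    """Approximate typicities by power iteration (illustrative sketch).

    incidence[i][j] is 1 if strategy i has characteristic j, else 0.
    Returns (strategy_typicities, characteristic_typicities).
    """
    n_char = len(incidence[0])
    char_t = [1.0 / n_char] * n_char          # start from equal weights
    for _ in range(iterations):
        # Principle 1: a strategy's typicity is the sum of the typicities
        # of its characteristics.
        strat_t = [sum(c for c, has in zip(char_t, row) if has)
                   for row in incidence]
        # Principle 2: a characteristic's typicity is proportional to the
        # sum of the typicities of the strategies that have it.
        raw = [sum(s for s, row in zip(strat_t, incidence) if row[j])
               for j in range(n_char)]
        # Principle 3: the characteristic typicities sum to 1.
        total = sum(raw)
        char_t = [r / total for r in raw]
    strat_t = [sum(c for c, has in zip(char_t, row) if has)
               for row in incidence]
    return strat_t, char_t


# Made-up incidence matrix: 4 strategies (rows), 3 characteristics (columns).
A = [[1, 1, 1],
     [1, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]
strategy_typicity, characteristic_typicity = typicities(A)
print(strategy_typicity)
print(characteristic_typicity)
```

Because the characteristic typicities sum to 1, a strategy that has every characteristic receives typicity 1, and every other strategy receives a value between zero and one.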

A definition of the typicities as a least-squares approximation and questions of 
existence and uniqueness are discussed in Kuon (1993). 

Figure 9.8. Typicity of characteristics and strategies 

The Spearman rank correlation between the success and the typicity of a 
strategy is 0.619, which is significant at the 1% level (two-sided). This means 
that the more typical a strategy is, the more successful it tends to be. 
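
For readers unfamiliar with the statistic, the Spearman rank correlation is the ordinary correlation coefficient computed on ranks instead of raw values. The following Python sketch shows the computation on made-up data for five strategies; the profit and typicity values and the function names are hypothetical and do not reproduce the data of the study.

```python
def ranks(values):
    """Ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up data for five strategies: tournament profit and typicity.
profit = [3400, 3100, 2900, 2600, 2500]
typicity = [0.95, 0.80, 0.85, 0.60, 0.40]
print(spearman(profit, typicity))
```

In the made-up data only two neighboring strategies are out of order, so the coefficient is close to, but below, its maximum of 1.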

The typical approach to the strategic problem is fundamentally different from 
that of traditional oligopoly theories including applications of non-cooperative 
normative game theory. These theories assume that a player has quantitatively 
specified expectations about the other player’s behavior and tries to optimize 
against them. Contrary to this, a typical strategy does not involve any quantita- 
tively specified expectations and does not try to optimize. Of course, some 
vague qualitative expectations are connected to an ideal point and a measure-for- 
measure policy. The strategy designer expects that his strategy will be successful 
in achieving his cooperative goal. However, the cooperative goal is not deter- 
mined by a procedure which maximizes the player’s short run or long run 
expected profit, but rather as the result of the application of fairness criteria. 
Similarly, a measure-for-measure policy is not derived as a solution of an 
optimization problem. 

Typically, a strategy designer takes an active approach to the strategic prob- 
lem. It is seen as the task of the strategy to exert an active influence on the 
opponent’s behavior. A more passive optimization approach would need quanti- 
tatively specified expectations, but the active approach completely avoids them. 
The strategy designer first fixes a cooperative goal and then looks for a way to 
induce the other player to conform to it. The usual answer to this inducement 
problem is a measure-for-measure policy which administers the right kind of 
rewards and punishments in order to guide the opponent’s behavior towards 
one’s own cooperative goal. 

The structure of a typical strategy shows that experienced subjects approach 
the strategic problem in a rational way. However, the rationality underlying 
their behavior is not the full rationality of Bayesian decision theory and game 
theory, but rather a non-optimizing bounded rationality of goal formation and 
goal pursuit. 



10. Conclusion 

In this lecture I have tried to give an impression of the nature of descriptive 
theories applicable to problems of cooperation. The typical behavior of experi- 
mental subjects is by no means irrational. It is based on its own rationality, 
which is quite different from that of normative decision and game theory. Of 
course, up to now we have only a limited insight into the ways in which strate- 
gic problems are approached by human decision makers. However, it can be 
hoped that in the next decades our knowledge in this area will grow rapidly. 

I hope that it has become clear that the few behavioral theories already 
available have great potential for fruitful application in economic modelling. 
However, many more such theories must be developed before a coherent picture 
of boundedly rational human strategic behavior can emerge. Of course, this will 
require much painstaking experimental and other empirical work. 

Reality is still full of undiscovered regularities. We must discover these 
regularities. 